1  Introduction


Lord proposed and discussed a bias function of the maximum likelihood estimate in the context
of the three-parameter logistic model (cf. Lord, 1983). In so doing, he expanded the likelihood
equation by Taylor’s formula, obtained an equation which includes the conditional
expectation of the discrepancy between the maximum likelihood estimate and the true ability, and
ignored all terms of orders higher than $n^{-1}$, where $n$ indicates the number of items.

Let $\theta$ be ability, or latent trait, which assumes any real number. Let $g$ (= 1, 2, ..., $n$) denote
an item, $k_g$ be a discrete response to item $g$, and $P_{k_g}(\theta)$ denote the operating characteristic of the
discrete response $k_g$, i.e., the conditional probability, given $\theta$, with which the examinee responds to
item $g$ with $k_g$. The item response information function, $I_{k_g}(\theta)$, is defined by

(1.1)  $I_{k_g}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_{k_g}(\theta)$ ,

and the item information function $I_g(\theta)$ is the conditional expectation of the item response information
function, given $\theta$, so that we can write

(1.2)  $I_g(\theta) = E[I_{k_g}(\theta) \mid \theta] = \sum_{k_g} I_{k_g}(\theta) P_{k_g}(\theta)$ .

On the dichotomous response level (Samejima, 1972), the set of operating characteristics for item $g$
is represented by a single function, i.e., the operating characteristic of the positive response, which is
called the item characteristic function, or item response function.$^1$

Let $P_g(\theta)$ be the item characteristic function in the three-parameter logistic model, which is given
by

(1.3)  $P_g(\theta) = c_g + (1 - c_g)[1 + \exp\{-D a_g(\theta - b_g)\}]^{-1}$ ,

where $a_g$, $b_g$ and $c_g$ are the item discrimination, difficulty and guessing parameters, and $D$ is a
scaling factor, which is set equal to 1.7 when the logistic model is used as a substitute for the normal
ogive model. Lord’s bias function $B(\theta)$ can be written as

(1.4)  $B(\theta) = D [I(\theta)]^{-2} \sum_{g=1}^{n} a_g I_g(\theta) [\Psi_g(\theta) - (1/2)]$ ,

where

(1.5)  $\Psi_g(\theta) = [1 + \exp\{-D a_g(\theta - b_g)\}]^{-1}$ ,

and $I_g(\theta)$ and $I(\theta)$ are the item information function and the test information function, respectively,
which can be given by

(1.6)  $I_g(\theta) = [P'_g(\theta)]^2 [P_g(\theta)\{1 - P_g(\theta)\}]^{-1}$

and

(1.7)  $I(\theta) = \sum_{g=1}^{n} I_g(\theta)$ ,

with $P'_g(\theta)$ indicating the first derivative of $P_g(\theta)$ with respect to $\theta$. The former of these two
formulae can be given as a special case of the item information function given by (1.2), which is defined
for the general case of discrete responses. (Incidentally, in Lord’s paper, $B_1(\hat\theta)$ is used for this bias
function. This is not appropriate, however, since it is a function of $\theta$ itself, not of its maximum
likelihood estimate $\hat\theta$.)

$^1$ The term, item response function, has been widely used in recent years by researchers who deal solely with the
dichotomous response level. From the more comprehensive standpoint, however, this term is ambiguous and misleading,
and not appropriate to use. On the graded response level, for example, there may be many more than two item response
categories, or there may even be an infinite number of response categories, and the use of item response function for
one of these many response categories is not justifiable. For this reason, throughout this paper, the original term, item
characteristic function, will be used to indicate the conditional probability for the positive response, given latent trait, on
the dichotomous response level.
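As a rough illustration of formulae (1.3) through (1.7), the following Python sketch evaluates Lord’s bias function for a small set of items. The three (a, b, c) triples and the function names are hypothetical and serve only to show how the pieces fit together; the derivative of (1.3) is written out analytically.

```python
import math

D = 1.7  # scaling factor used in (1.3)

def psi(theta, a, b):
    """Logistic function Psi_g(theta) of (1.5)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

def p3pl(theta, a, b, c):
    """Item characteristic function (1.3)."""
    return c + (1.0 - c) * psi(theta, a, b)

def item_info(theta, a, b, c):
    """Item information (1.6): [P']^2 / (P (1 - P))."""
    s = psi(theta, a, b)
    p = p3pl(theta, a, b, c)
    dp = D * a * (1.0 - c) * s * (1.0 - s)  # first derivative of (1.3)
    return dp * dp / (p * (1.0 - p))

def lord_bias(theta, items):
    """Lord's bias function (1.4) for a list of (a, b, c) triples."""
    infos = [item_info(theta, a, b, c) for a, b, c in items]
    total = sum(infos)  # test information (1.7)
    num = sum(a * i * (psi(theta, a, b) - 0.5)
              for (a, b, c), i in zip(items, infos))
    return D * num / total ** 2

# Hypothetical item parameters (a_g, b_g, c_g), for illustration only.
items = [(1.0, -1.0, 0.2), (1.0, 0.0, 0.2), (1.0, 1.0, 0.2)]
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(lord_bias(theta, items), 4))
```

With the guessing parameters set to zero and difficulties placed symmetrically around zero, the function returns a value of practically zero at $\theta = 0$, which is a convenient sanity check.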

2  Rationale 

A similar logic can be adopted for the general case, in which item responses are simply discrete.
We assume that there are a finite or an enumerable number of $k_g$’s as possible responses to item $g$.
Thus for the set of $n$ items, we can write for the response pattern $V$

(2.1)  $V = (k_1, k_2, \ldots, k_g, \ldots, k_n)$ .

We assume that the operating characteristic $P_{k_g}(\theta)$ is, at least, three-times differentiable with respect
to $\theta$. By virtue of local independence, we can write for the likelihood function

(2.2)  $L_V(\theta) = P_V(\theta) = \prod_{k_g \in V} P_{k_g}(\theta)$ .

Thus the likelihood equation is given by

(2.3)  $\frac{\partial}{\partial\theta} \log L_V(\theta) = \sum_{k_g \in V} \frac{\partial}{\partial\theta} \log P_{k_g}(\theta) \equiv 0$ .

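We define $\Gamma_{s k_g}(\theta)$ such that

(2.4)  $\Gamma_{s k_g}(\theta) = \frac{\partial^s}{\partial\theta^s} \log P_{k_g}(\theta)$

for $s = 1, 2, \ldots$ . We notice, in particular, that

(2.5)  $\Gamma_{1 k_g}(\theta) = P'_{k_g}(\theta) [P_{k_g}(\theta)]^{-1} = A_{k_g}(\theta)$ ,

where $A_{k_g}(\theta)$ is the basic function (Samejima, 1969), and

(2.6)  $\Gamma_{2 k_g}(\theta) = P''_{k_g}(\theta) [P_{k_g}(\theta)]^{-1} - [A_{k_g}(\theta)]^2$

and

(2.7)  $\Gamma_{3 k_g}(\theta) = P'''_{k_g}(\theta) [P_{k_g}(\theta)]^{-1} - 3 A_{k_g}(\theta) P''_{k_g}(\theta) [P_{k_g}(\theta)]^{-1} + 2 [A_{k_g}(\theta)]^3$ ,

where the superscripts $'$, $''$ and $'''$ indicate the first, second and third partial derivatives of the
function with respect to $\theta$, respectively. Thus from (2.3) and (2.5) we can write

(2.8)  $\sum_{k_g \in V} \Gamma_{1 k_g}(\hat\theta_V) = \sum_{k_g \in V} A_{k_g}(\hat\theta_V) = 0$ .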


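Let $\Gamma_{sV}(\theta)$ be defined by

(2.9)  $\Gamma_{sV}(\theta) = \sum_{k_g \in V} \Gamma_{s k_g}(\theta)$

for $s = 1, 2, \ldots$ . For a fixed value of $\theta$ we can write by Taylor’s formula

(2.10)  $\Gamma_{1V}(\hat\theta_V) = \Gamma_{1V}(\theta) + (\hat\theta_V - \theta)\Gamma_{2V}(\theta) + (1/2)(\hat\theta_V - \theta)^2 \Gamma_{3V}(\theta)$
        $+ (1/6)(\hat\theta_V - \theta)^3 \Gamma_{4V}(\theta) + (1/24)(\hat\theta_V - \theta)^4 \Gamma_{5V}(\xi) = 0$ ,

where $\xi$ is some value between $\theta$ and $\hat\theta_V$.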
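Since we have

(2.11)  $\sum_{k_g} P_{k_g}(\theta) = 1$ ,

we obtain

(2.12)  $\sum_{k_g} \frac{\partial^s}{\partial\theta^s} P_{k_g}(\theta) = 0$

for $s = 1, 2, \ldots$ . Equation (2.12) will be helpful in following the mathematical derivations which are
needed in obtaining the bias function. The response pattern information function, $I_V(\theta)$, is defined
by

(2.13)  $I_V(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_V(\theta) = \sum_{k_g \in V} I_{k_g}(\theta)$ ,

and the test information function $I(\theta)$ is the conditional expectation of $I_V(\theta)$, given $\theta$, for which
we can write

(2.14)  $I(\theta) = E[I_V(\theta) \mid \theta] = \sum_V I_V(\theta) P_V(\theta)$ .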

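Let $f_{k_g}(\theta)$ be any function of $\theta$ defined for a specific discrete response $k_g$. We can write

(2.15)  $\sum_V \sum_{k_g \in V} f_{k_g}(\theta) P_V(\theta) = \sum_V \sum_{k_g \in V} f_{k_g}(\theta) P_{k_g}(\theta) P_{V_{-g}}(\theta)$
        $= \sum_{g=1}^{n} \sum_{k_g} f_{k_g}(\theta) P_{k_g}(\theta)$ ,

by virtue of the fact that

(2.16)  $\sum_{V_{-g}} P_{V_{-g}}(\theta) = 1$ ,

where $V_{-g}$ is the response pattern of $(n - 1)$ discrete item scores obtained by deleting $k_g$ from $V$.
Replacing $f_{k_g}(\theta)$ by $I_{k_g}(\theta)$ in (2.15) and using this result, (1.2), (2.13) and (2.14), we can obtain the
same equation as (1.7).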

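Let $\gamma_{sg}(\theta)$ be the conditional expectation of $\Gamma_{s k_g}(\theta)$, given $\theta$, which can be written as

(2.17)  $\gamma_{sg}(\theta) = E[\Gamma_{s k_g}(\theta) \mid \theta] = \sum_{k_g} \Gamma_{s k_g}(\theta) P_{k_g}(\theta)$ .

In particular, we have from (2.5), (2.6), (2.7) and (2.12)

(2.18)  $\gamma_{1g}(\theta) = \sum_{k_g} P'_{k_g}(\theta) = 0$ ,

(2.19)  $\gamma_{2g}(\theta) = -\sum_{k_g} [A_{k_g}(\theta)]^2 P_{k_g}(\theta)$

and

(2.20)  $\gamma_{3g}(\theta) = 2 \sum_{k_g} [A_{k_g}(\theta)]^3 P_{k_g}(\theta) - 3 \sum_{k_g} A_{k_g}(\theta) P''_{k_g}(\theta)$ .

It is noted from (1.1), (1.2), (2.12) and (2.17) that we can also write

(2.21)  $\gamma_{2g}(\theta) = -I_g(\theta)$ .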


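We further define $\gamma_s(\theta)$ such that

(2.22)  $\gamma_s(\theta) = (1/n) \sum_{g=1}^{n} \gamma_{sg}(\theta)$

for $s = 1, 2, \ldots$ . In particular, we have

(2.23)  $\gamma_1(\theta) = 0$ ,

(2.24)  $\gamma_2(\theta) = -(1/n) \sum_{g=1}^{n} I_g(\theta) = -(1/n) I(\theta)$

and

(2.25)  $\gamma_3(\theta) = (2/n) \sum_{g=1}^{n} \sum_{k_g} [A_{k_g}(\theta)]^3 P_{k_g}(\theta) - (3/n) \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) P''_{k_g}(\theta)$ .

We also define $\varepsilon_{s k_g}(\theta)$ and $\varepsilon_{sV}(\theta)$ such that

(2.26)  $\varepsilon_{s k_g}(\theta) = \Gamma_{s k_g}(\theta) - \gamma_{sg}(\theta)$

and

(2.27)  $\varepsilon_{sV}(\theta) = (1/n) \sum_{k_g \in V} \varepsilon_{s k_g}(\theta)$ ,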

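respectively. In particular, we can write from these definitions, (1.7), (2.5), (2.6), (2.18) and (2.21)

(2.28)  $\varepsilon_{1 k_g}(\theta) = \Gamma_{1 k_g}(\theta) = A_{k_g}(\theta)$ ,

(2.29)  $\varepsilon_{2 k_g}(\theta) = P''_{k_g}(\theta) [P_{k_g}(\theta)]^{-1} - [A_{k_g}(\theta)]^2 + I_g(\theta)$ ,

(2.30)  $\varepsilon_{1V}(\theta) = (1/n) \sum_{k_g \in V} A_{k_g}(\theta)$

and

(2.31)  $\varepsilon_{2V}(\theta) = (1/n) \sum_{k_g \in V} P''_{k_g}(\theta) [P_{k_g}(\theta)]^{-1} - (1/n) \sum_{k_g \in V} [A_{k_g}(\theta)]^2 + (1/n) I(\theta)$ .

We can also obtain from (2.15), (2.17), (2.22), (2.26) and (2.27) for the conditional expectation of
$\varepsilon_{sV}(\theta)$, given $\theta$,

(2.32)  $E[\varepsilon_{sV}(\theta) \mid \theta] = \sum_V \varepsilon_{sV}(\theta) P_V(\theta) = \gamma_s(\theta) - \gamma_s(\theta) = 0$ .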

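With these definitions of $\gamma_s(\theta)$ and $\varepsilon_{sV}(\theta)$ and from (2.10) we have

(2.33)  $\varepsilon_{1V}(\theta) + (\hat\theta_V - \theta)[\gamma_2(\theta) + \varepsilon_{2V}(\theta)] + (1/2)(\hat\theta_V - \theta)^2 [\gamma_3(\theta) + \varepsilon_{3V}(\theta)]$
        $+ (1/6)(\hat\theta_V - \theta)^3 [\gamma_4(\theta) + \varepsilon_{4V}(\theta)] + (1/24)(\hat\theta_V - \theta)^4 n^{-1} \Gamma_{5V}(\xi) = 0$ ,

and proceeding from here by taking the conditional expectation of each term in (2.33) with respect to
$V$, given $\theta$, and ignoring all terms whose orders are higher than $n^{-1}$, we obtain

(2.34)  $E[\varepsilon_{1V}(\theta) \mid \theta] + \gamma_2(\theta) E[\hat\theta_V - \theta \mid \theta] + E[(\hat\theta_V - \theta)\varepsilon_{2V}(\theta) \mid \theta] + (1/2)\gamma_3(\theta) E[(\hat\theta_V - \theta)^2 \mid \theta] = 0$ .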


It is obvious from (2.32) that the first term on the left hand side of (2.34) disappears. As for the
fourth and last term in (2.34), we can use the asymptotic variance of the distribution of the maximum
likelihood estimate as the approximation to its last factor, i.e.,

(2.35)  $E[(\hat\theta_V - \theta)^2 \mid \theta] \doteq [I(\theta)]^{-1}$ .

Since $\gamma_2(\theta)$ and $\gamma_3(\theta)$ are given by (2.24) and (2.25), respectively, all we need to do is to evaluate
the third term on the left hand side of (2.34) in the general framework. In so doing we need to multiply
(2.33) by $\varepsilon_{2V}(\theta)$, take its expectation with respect to $V$ and ignore all terms of $o(n^{-1})$, to obtain

(2.36)  $E[(\hat\theta_V - \theta)\varepsilon_{2V}(\theta) \mid \theta] = -[\gamma_2(\theta)]^{-1} E[\varepsilon_{1V}(\theta)\varepsilon_{2V}(\theta) \mid \theta]$ .
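Thus the remaining task is to evaluate the second factor of the right hand side of (2.36). From (2.30)
and (2.31) we have

(2.37)  $E[\varepsilon_{1V}(\theta)\varepsilon_{2V}(\theta) \mid \theta] = \sum_V \varepsilon_{1V}(\theta)\varepsilon_{2V}(\theta) P_V(\theta)$
        $= (1/n^2) \sum_V \sum_{k_g \in V} A_{k_g}(\theta) \sum_{k_h \in V} P''_{k_h}(\theta) [P_{k_h}(\theta)]^{-1} P_V(\theta)$
        $- (1/n^2) \sum_V \sum_{k_g \in V} A_{k_g}(\theta) \sum_{k_h \in V} [A_{k_h}(\theta)]^2 P_V(\theta)$
        $+ (1/n^2) I(\theta) \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) P_{k_g}(\theta)$ .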

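It is obvious from (2.32) that the third term on the right hand side of (2.37) disappears. We can write
by virtue of (2.15)

(2.38)  $\sum_V \sum_{k_g \in V} A_{k_g}(\theta) \sum_{k_h \in V} P''_{k_h}(\theta) [P_{k_h}(\theta)]^{-1} P_V(\theta)$
        $= \sum_V \sum_{k_g \in V} A_{k_g}(\theta) P''_{k_g}(\theta) [P_{k_g}(\theta)]^{-1} P_V(\theta)$
        $+ \sum_V \sum_{k_g \in V} A_{k_g}(\theta) \sum_{k_h \in V, h \neq g} P''_{k_h}(\theta) [P_{k_h}(\theta)]^{-1} P_V(\theta)$
        $= \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) P''_{k_g}(\theta)$
        $+ \sum_V \sum_{k_g \in V} A_{k_g}(\theta) \sum_{k_h \in V, h \neq g} P''_{k_h}(\theta) [P_{k_h}(\theta)]^{-1} P_V(\theta)$ .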

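It is also obvious from (2.5), (2.12) and (2.15) that we can further rewrite the second term of the rightmost
side of (2.38) in such a way that

(2.39)  $\sum_V \sum_{k_g \in V} A_{k_g}(\theta) \sum_{k_h \in V, h \neq g} P''_{k_h}(\theta) [P_{k_h}(\theta)]^{-1} P_V(\theta)$
        $= \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) P_{k_g}(\theta) \sum_{V_{-g}} \sum_{k_h \in V_{-g}} P''_{k_h}(\theta) [P_{k_h}(\theta)]^{-1} P_{V_{-g}}(\theta)$
        $= \sum_{g=1}^{n} \sum_{k_g} P'_{k_g}(\theta) \sum_{h \neq g} \sum_{k_h} P''_{k_h}(\theta) = 0$ .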

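Following a similar process, we have

(2.40)  $\sum_V \sum_{k_g \in V} A_{k_g}(\theta) \sum_{k_h \in V} [A_{k_h}(\theta)]^2 P_V(\theta) = \sum_V \sum_{k_g \in V} [A_{k_g}(\theta)]^2 \sum_{k_h \in V} A_{k_h}(\theta) P_V(\theta)$
        $= \sum_V \sum_{k_g \in V} [A_{k_g}(\theta)]^3 P_V(\theta) + \sum_V \sum_{k_g \in V} [A_{k_g}(\theta)]^2 \sum_{k_h \in V, h \neq g} A_{k_h}(\theta) P_V(\theta)$
        $= \sum_{g=1}^{n} \sum_{k_g} [A_{k_g}(\theta)]^3 P_{k_g}(\theta)$ .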

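Substituting these results into (2.37) and rearranging, we obtain

(2.41)  $E[\varepsilon_{1V}(\theta)\varepsilon_{2V}(\theta) \mid \theta] = (1/n^2) \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) [P''_{k_g}(\theta) - A_{k_g}(\theta) P'_{k_g}(\theta)]$ .

Thus we can write from this result, (2.24) and (2.36)

(2.42)  $E[(\hat\theta_V - \theta)\varepsilon_{2V}(\theta) \mid \theta] = (1/n) [I(\theta)]^{-1} \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) [P''_{k_g}(\theta) - A_{k_g}(\theta) P'_{k_g}(\theta)]$ ,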

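where $P'_{k_g}(\theta)$ and $P''_{k_g}(\theta)$ indicate the first and second derivatives of $P_{k_g}(\theta)$ with respect to $\theta$,
respectively. Substituting (2.21), (2.22), (2.35) and (2.42) into (2.34) and rearranging, we obtain for
the bias function, $B(\theta)$, of the maximum likelihood estimate

(2.43)  $B(\theta) = E[\hat\theta_V - \theta \mid \theta] = -(1/2) [I(\theta)]^{-2} \sum_{g=1}^{n} \sum_{k_g} A_{k_g}(\theta) P''_{k_g}(\theta)$
        $= -(1/2) [I(\theta)]^{-2} \sum_{g=1}^{n} \sum_{k_g} P'_{k_g}(\theta) P''_{k_g}(\theta) [P_{k_g}(\theta)]^{-1}$ .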
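The rearrangement leading to (2.43) can be spelled out as follows. This is only a sketch of the algebra; the shorthand $S_1$ and $S_2$ is introduced here for brevity and does not appear elsewhere, with $S_1 = \sum_g \sum_{k_g} [A_{k_g}(\theta)]^3 P_{k_g}(\theta)$ and $S_2 = \sum_g \sum_{k_g} A_{k_g}(\theta) P''_{k_g}(\theta)$. The second line uses $\gamma_2(\theta) = -(1/n) I(\theta)$, (2.42) with $A_{k_g} P'_{k_g} = [A_{k_g}]^2 P_{k_g}$, and (2.25) with (2.35).

```latex
\begin{align*}
0 &= \gamma_2(\theta)\,B(\theta)
   + E[(\hat\theta_V-\theta)\,\varepsilon_{2V}(\theta)\mid\theta]
   + \tfrac12\,\gamma_3(\theta)\,[I(\theta)]^{-1} \\
  &= -\frac{I(\theta)}{n}\,B(\theta)
   + \frac{S_2 - S_1}{n\,I(\theta)}
   + \frac{2S_1 - 3S_2}{2n\,I(\theta)} \\
  &= -\frac{I(\theta)}{n}\,B(\theta) - \frac{S_2}{2n\,I(\theta)} \\
\Longrightarrow\quad
B(\theta) &= -\tfrac12\,[I(\theta)]^{-2}\,S_2 .
\end{align*}
```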

It is obvious from this result that the bias of the maximum likelihood estimate on the discrete
response level is inversely related to the amount of test information, i.e., we can expect a
small amount of bias when the amount of test information is large, and vice versa. The relationship
is rather complicated, however, because of the numerator of the rightmost side of (2.43), which
includes $P_{k_g}(\theta)$ and its first and second derivatives with respect to $\theta$.
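To make (2.43) concrete for responses with more than two categories, the sketch below evaluates it for graded item scores. It assumes, purely for illustration, that each category boundary follows a logistic curve with a common discrimination parameter per item, so that the operating characteristic of score $x$ is the difference of two adjacent boundary curves; the item parameters and function names are hypothetical. The test information is computed as $\sum_g \sum_{k_g} [P'_{k_g}(\theta)]^2 [P_{k_g}(\theta)]^{-1}$, which follows from (2.19) and (2.21).

```python
import math

D = 1.7  # scaling factor

def boundary(theta, a, b):
    """Logistic boundary curve and its first two derivatives."""
    s = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    d1 = D * a * s * (1.0 - s)
    d2 = D * D * a * a * s * (1.0 - s) * (1.0 - 2.0 * s)
    return s, d1, d2

def category_curves(theta, a, bs):
    """(P, P', P'') for each graded score 0..m of one item.

    bs holds the increasing boundary difficulties b_1 < ... < b_m.
    """
    bounds = [(1.0, 0.0, 0.0)]
    bounds += [boundary(theta, a, b) for b in bs]
    bounds.append((0.0, 0.0, 0.0))
    return [tuple(u - v for u, v in zip(bounds[x], bounds[x + 1]))
            for x in range(len(bs) + 1)]

def bias(theta, items):
    """Bias function (2.43) for a list of (a, [b_1, ..., b_m]) items."""
    info = 0.0
    num = 0.0
    for a, bs in items:
        for p, d1, d2 in category_curves(theta, a, bs):
            info += d1 * d1 / p   # contribution to I(theta)
            num += d1 * d2 / p    # P' P'' / P, summand of (2.43)
    return -0.5 * num / info ** 2

# Hypothetical graded items: (discrimination, boundary difficulties).
items = [(1.0, [-1.5, 0.0, 1.5]), (0.8, [-0.5, 0.5])]
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(bias(theta, items), 4))
```

For a single item whose boundaries are placed symmetrically around zero, the computed bias vanishes at $\theta = 0$ and changes sign there, in line with the general behavior discussed for the dichotomous case.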

On the graded response level, where item score $x_g$ assumes successive integers, 0 through $m_g$,
each $k_g$ in (2.43) must be replaced by $x_g$. On the dichotomous response level, it can be reduced to
the form

(2.44)  $B(\theta) = E[\hat\theta_V - \theta \mid \theta] = (-1/2) [I(\theta)]^{-2} \sum_{g=1}^{n} P'_g(\theta) P''_g(\theta) [P_g(\theta) Q_g(\theta)]^{-1}$ ,

where

(2.45)  $Q_g(\theta) = 1 - P_g(\theta)$ ,

with $P''_g(\theta)$ indicating the second derivative of $P_g(\theta)$ with respect to $\theta$. When $P'_g(\theta)$ is nonzero
throughout the entire range of $\theta$, we can also write

(2.46)  $B(\theta) = E[\hat\theta_V - \theta \mid \theta] = (-1/2) [I(\theta)]^{-2} \sum_{g=1}^{n} I_g(\theta) P''_g(\theta) [P'_g(\theta)]^{-1}$ .

This includes Lord’s bias function in the three-parameter logistic model as a special case.
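The special-case claim can be checked numerically. In the three-parameter logistic model, $P''_g(\theta) [P'_g(\theta)]^{-1} = D a_g [1 - 2\Psi_g(\theta)]$, so (2.46) reduces to (1.4). The sketch below compares the two expressions for a few hypothetical items; the function names and parameter values are assumptions made for the illustration.

```python
import math

D = 1.7  # scaling factor

def curves(theta, a, b, c):
    """Psi, P, P' and P'' for one three-parameter logistic item."""
    s = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    p = c + (1.0 - c) * s
    d1 = D * a * (1.0 - c) * s * (1.0 - s)
    d2 = D * D * a * a * (1.0 - c) * s * (1.0 - s) * (1.0 - 2.0 * s)
    return s, p, d1, d2

def bias_general(theta, items):
    """Dichotomous form (2.44) of the general bias function."""
    info = 0.0
    num = 0.0
    for item in items:
        s, p, d1, d2 = curves(theta, *item)
        info += d1 * d1 / (p * (1.0 - p))
        num += d1 * d2 / (p * (1.0 - p))
    return -0.5 * num / info ** 2

def bias_lord(theta, items):
    """Lord's bias function (1.4)."""
    info = 0.0
    num = 0.0
    for a, b, c in items:
        s, p, d1, d2 = curves(theta, a, b, c)
        ig = d1 * d1 / (p * (1.0 - p))
        info += ig
        num += a * ig * (s - 0.5)
    return D * num / info ** 2

items = [(0.9, -1.0, 0.2), (1.3, 0.2, 0.25), (0.7, 1.1, 0.15)]  # hypothetical
for theta in (-1.5, 0.0, 1.5):
    print(theta, bias_general(theta, items), bias_lord(theta, items))
```

The two columns should coincide up to rounding error at every value of $\theta$.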
3  Dichotomous  Response  Level

On the dichotomous response level, where we deal only with two item score categories, as is exemplified
by such a pair as “right” and “wrong”, or “agree” and “disagree”, the most commonly used family of
models may be the one in which the item characteristic function is strictly increasing in $\theta$. In such a
case, $P'_g(\theta)$, the first derivative of the item characteristic function with respect to $\theta$, is nonnegative.
If, in addition, $P'_g(\theta)$ is unimodal, as is the case with many commonly used mathematical models,
the second derivative, $P''_g(\theta)$, assumes positive values up to the modal point, and negative
values beyond it. A close examination of (2.46) reveals that, in such a model, the direction of bias is positive for
very high levels of $\theta$, and negative for very low levels of $\theta$. In other words, individuals of
very high levels of ability tend to be overevaluated, and those of very low levels of ability tend to be
underevaluated.

Now  we  shall  observe  the  bias  functions  in  some  specified  models  which  belong  to  this  family. 

3.1  Normal  Ogive  Model 

In the normal ogive model, the item characteristic function is given by

(3.1)  $P_g(\theta) = (2\pi)^{-1/2} \int_{-\infty}^{a_g(\theta - b_g)} e^{-u^2/2} \, du$ ,

where $a_g$ and $b_g$ are the item discrimination and difficulty parameters, respectively. From (3.1), we
can write for the first and second derivatives of $P_g(\theta)$ with respect to $\theta$

(3.2)  $P'_g(\theta) = a_g (2\pi)^{-1/2} e^{-a_g^2 (\theta - b_g)^2 / 2}$

and

(3.3)  $P''_g(\theta) = -a_g^2 (\theta - b_g) P'_g(\theta)$ ,

respectively. Substituting (3.2) and (3.3) into (2.46) and rearranging, we obtain for the bias function

(3.4)  $B(\theta) = (1/2) [I(\theta)]^{-2} \sum_{g=1}^{n} a_g^2 (\theta - b_g) I_g(\theta)$ .

It is obvious from its definition, which is given by (1.6), that the item information function, $I_g(\theta)$,
is nonnegative regardless of the mathematical model. From this, we can see that for a fixed value
of $\theta$ the sign of the term under the summation sign in (3.4) depends upon the value of the difficulty
parameter $b_g$ of each item $g$. We can also see that, if all the $n$ items have the same value of the difficulty
parameter, i.e., $b = b_1 = b_2 = \ldots = b_n$, then the bias function is strictly increasing in $\theta$, and equals
zero only at $\theta = b$, with positive and negative infinities as its two asymptotes. Moreover, since $P_g(\theta)$
is point-symmetric with $(b, 0.5)$ as the point of symmetry, the bias function is also point-symmetric,
with $(b, 0)$ as its point of symmetry. In this situation, generally speaking, we should expect a substantial
amount of bias as we depart from $\theta = b$.

In many situations of practical importance, however, it is desirable to have a test whose bias function
practically assumes zero for a wide range of $\theta$. Equation (3.4) suggests that, in order to materialize
such a test, we must develop a set of items whose difficulty parameters distribute widely and evenly, so
that, for a wide interval of $\theta$, the negative and positive terms under the summation sign of the right
hand side of (3.4) practically “cancel each other out”.
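The cancellation argument can be illustrated with a small computation. The following sketch evaluates (3.4) for two hypothetical seven-item tests in the normal ogive model, one with all difficulties equal to zero and one with difficulties spread evenly from -3 to 3; the item parameters are made up for the illustration and are not taken from the tables.

```python
import math

def item_terms(theta, a, b):
    """Return (I_g, a^2 (theta - b) I_g) for one normal ogive item."""
    z = a * (theta - b)
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))  # P_g(theta), eq. (3.1)
    q = 0.5 * math.erfc(z / math.sqrt(2.0))   # 1 - P_g(theta)
    dp = a * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # eq. (3.2)
    info = dp * dp / (p * q)                  # item information, eq. (1.6)
    return info, a * a * (theta - b) * info

def bias(theta, items):
    """Bias function (3.4) for a list of (a, b) pairs."""
    terms = [item_terms(theta, a, b) for a, b in items]
    total = sum(info for info, _ in terms)
    return 0.5 * sum(t for _, t in terms) / total ** 2

equal = [(1.0, 0.0)] * 7                            # hypothetical: b_g = 0 for all items
spread = [(1.0, float(b)) for b in range(-3, 4)]    # hypothetical: b_g = -3, ..., 3

for theta in (-1.0, 0.0, 1.0):
    print(theta, round(bias(theta, equal), 4), round(bias(theta, spread), 4))
```

At $\theta = \pm 1$ the evenly spread test shows a much smaller bias in absolute value than the test with equal difficulties, because terms with $\theta > b_g$ and $\theta < b_g$ largely offset one another.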


Table  3-1  presents  the  estimated  item  discrimination  parameter  ag  and  item  difficulty  parameter 
bg  for  each  of  the  forty-three  dichotomous  test  items  of  the  Level  11  Vocabulary  Subtest  of  the  Iowa 
Tests  of  Basic  Skills,  which  were  obtained  by  assuming  the  normal  ogive  model  (Samejima,  1984a).  The 
data  were  collected  for  2,356  school  children  of  approximately  age  eleven  by  the  Iowa  Testing  Bureau, 
and  were  analyzed  by  using  the  tetrachoric  correlation  matrix  and  the  principal  factor  solution  of  factor 
analysis.  Thus  we  have  for  the  item  parameter  estimates 

(3.5)  $a_g = \rho_g (1 - \rho_g^2)^{-1/2}$

and 

(3.6)  h=~lgPg 1  . 

where $\rho_g$ is the factor loading of item $g$ on the single, dominating common factor, which is operationally
defined  as  9  ,  and  is  the  normal  deviate  corresponding  to  the  proportion  correct,  pg  ,  of  item  g  . 
Corresponding results for each of the fifty-five dichotomous test items of Test J1 of Shiba’s Word/Phrase
Comprehension Tests are also presented as Table 3-2 (cf. Shiba, 1978, Samejima, 1984b). Those data
were based upon 2,259 junior high school students in Japan.

Figure 3-1 presents the square root of the test information function of each of these two tests by
solid and dashed lines, respectively. We can see that these curves are fairly similar. The bias functions,
which were obtained by (3.4) for the two tests, are shown in Figure 3-2. We can see that, overall, the
bias is less conspicuous for Test J1 than for the Iowa Subtest. In order to show the relationship between
the amount of bias and that of test information, Figure 3-3 presents the square root of test information
and the bias function together for each of the two tests. It is interesting to note that in both cases, if we
tolerate biases of up to ±0.1, for example, then the range of $\theta$ in which this is the case corresponds
to the interval of $\theta$ where the square root of test information is approximately 1.75 or greater, or
where the amount of test information is approximately 3.0 or greater.

3.2  Logistic  Model 

Since the logistic model can be considered as a special case of the three-parameter logistic model
in which we set the guessing parameter, $c_g$, equal to zero, Lord’s bias function, which is written as
formula (1.4) in the present paper, is also applicable. Note, however, that neither $I_g(\theta)$ nor $I(\theta)$ in
the formula includes the guessing parameter $c_g$, when it is used for the logistic model.

A close examination of (1.4) reveals strong similarities of the logistic model with the normal ogive
model, i.e., 1) for a fixed value of $\theta$, the sign of the term under the summation sign in (1.4) depends
upon the difficulty parameter $b_g$ of each item $g$; 2) if $b = b_1 = b_2 = \ldots = b_n$, the bias function is
strictly increasing in $\theta$ with positive and negative infinities as its two asymptotes, equals zero
only at $\theta = b$, and is point-symmetric with $(b, 0)$ as the point of symmetry; and 3) in order to
make the bias practically nil for a wide range of $\theta$, we must develop items whose difficulty parameters
distribute widely and evenly.

Figures 3-4 and 3-5 present two sets of examples of the square root of the test information function
and the bias function obtained by following the logistic model with $D = 1.7$, by solid and dashed lines,
respectively. They are the results obtained by using the same sets of estimated discrimination and
difficulty parameters of the Iowa Level 11 Vocabulary Subtest and Shiba’s Test J1, which are shown
in Tables 3-1 and 3-2, respectively, as we did for the normal ogive model in the preceding section. As
is expected, these results are close to those obtained by following the normal ogive model. As was
done in the normal ogive model, the square root of test information and the bias function are put
together for each of the two tests, and presented in Figure 3-6. The relationship between the square
root of test information and the amount of bias appears to be almost the same as was recognized in the
corresponding result obtained by following the normal ogive model, which we discussed in the preceding
section.
TABLE 3-2

Estimated Item Discrimination Parameter $a_g$ and Item Difficulty Parameter $b_g$ for Each
of the Fifty-Five Dichotomous Test Items of Test J1 of Shiba’s Word/Phrase
Comprehension Tests Collected for 2,259 Junior High School Students.

Discrimination     Difficulty
Parameter $a_g$    Parameter $b_g$

0.568              -1.263
0.710              -0.809
0.794              -0.097
0.495              -0.741
0.583               0.205
0.771              -1.974
0.386              -0.872
0.572              -0.327
0.950              -1.266
0.437              -1.036
0.508              -1.061
0.472               0.486
0.704              -0.224
0.303              -1.671
0.390              -0.626
0.583              -1.573
0.653              -0.972
0.293               1.058
0.470              -0.904
0.451              -1.038
0.456               0.151
0.562              -1.313
0.450              -1.691
0.367              -0.424
0.525              -1.299
0.679              -1.094
FIGURE 3-1

Square Roots of the Test Information Functions for the Iowa Level 11 Vocabulary Subtest
(Solid Line) and for Shiba’s Test J1 (Dashed Line), Following the Normal Ogive Model.

FIGURE 3-2

Bias Functions of the Iowa Level 11 Vocabulary Subtest (Solid Line) and of Shiba’s Test
J1 (Dashed Line), Following the Normal Ogive Model.

FIGURE 3-3

Comparison of the Square Root of the Test Information Function (Solid Line) and the Bias
Function (Dashed Line) Following the Normal Ogive Model, of Each of the Two Tests, i.e.,
the Iowa Level 11 Vocabulary Subtest (Upper Graph) and Shiba’s Test J1 (Lower Graph).

FIGURE 3-4

Square Roots of the Test Information Functions for the Iowa Level 11 Vocabulary Subtest
(Solid Line) and for Shiba’s Test J1 (Dashed Line), Following the Logistic Model.

FIGURE 3-5

FIGURE 3-6

Comparison of the Square Root of the Test Information Function (Solid Line) and the Bias
Function (Dashed Line) Following the Logistic Model, of Each of the Two Tests, i.e., the
Iowa Level 11 Vocabulary Subtest (Upper Graph) and Shiba’s Test J1 (Lower Graph).

3.3  Rasch  Model 


Since the Rasch model is a special case of the logistic model, in which the discrimination parameters
of all the items are identical, i.e., $a = a_1 = a_2 = \ldots = a_n$, the bias function is obtained by removing
$a_g$ on the right hand side of formula (1.4), provided that the scale unit of $\theta$ be adjusted to the
common discrimination parameter. Thus all the observations made for the logistic model also apply to
the Rasch model.

Figure 3-7 presents the item characteristic functions of 25 items, each of which follows the Rasch
model with $D = 1.7$. The item difficulty parameters of these items are equally spaced, starting with
$b_g = -3.0$ and ending with $b_g = +3.0$ with equal step widths of 0.25. The square root of the
test information of this hypothetical test is shown by a solid line in Figure 3-8. In the same figure,
also presented are the square roots of test information of two subtests, i.e., the subtest of
13 items which is constructed by taking every other curve in Figure 3-7, and the subtest of 7 items
obtained by changing the equal steps from 0.25 to 1.0. They are drawn by dashed and dotted lines,
respectively. The bias functions of these three tests are shown in Figure 3-9 using the same types of
lines. In this figure, unlike the square root of test information, it looks as if substantial changes in the
number of items did not affect the bias functions to a great extent, especially for the range of $\theta$ from -2.0
through 2.0. In order to show the relationship between the square root of test information and the
bias function more clearly, Figure 3-10 presents both curves together for each of the three hypothetical
tests. If we tolerate biases of ±0.1, as we did before, then for the 25 item test, the critical value of the
square root of test information is approximately 2.0, and for the 13 and 7 item tests, these values
are approximately 1.6 and 1.1, respectively.

This result indicates the importance of the configuration of the item difficulty parameters, i.e., even
if the number of items is as small as seven, approximate unbiasedness can be reached for a wide
range of $\theta$, provided that the item difficulty parameters are distributed evenly over a wide range.
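The three hypothetical Rasch tests described above are fully specified by their difficulty parameters, so their curves can be recomputed. The sketch below does so under the stated setup ($D = 1.7$, difficulties from -3.0 to +3.0 in steps of 0.25, 0.5 and 1.0), using (1.4) with $a_g$ removed; the function name and the chosen values of $\theta$ are arbitrary, and no attempt is made to reproduce the figures exactly.

```python
import math

D = 1.7  # scaling factor

def rasch_curves(theta, difficulties):
    """Return (sqrt of test information, bias) for a set of Rasch items."""
    info = 0.0
    num = 0.0
    for b in difficulties:
        s = 1.0 / (1.0 + math.exp(-D * (theta - b)))
        ig = D * D * s * (1.0 - s)   # item information of a Rasch item
        info += ig
        num += ig * (s - 0.5)
    return math.sqrt(info), D * num / info ** 2

tests = {
    25: [-3.0 + 0.25 * i for i in range(25)],
    13: [-3.0 + 0.50 * i for i in range(13)],
    7: [-3.0 + 1.00 * i for i in range(7)],
}

for n_items, difficulties in tests.items():
    for theta in (0.0, 1.0, 2.0):
        root_info, bias = rasch_curves(theta, difficulties)
        print(n_items, theta, round(root_info, 3), round(bias, 4))
```

Reading the printed rows side by side shows the pattern discussed in the text: the square root of test information drops markedly as items are removed, while the bias inside the interval from -2.0 to 2.0 stays small for all three tests.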

3.4  Three-Parameter  Logistic  Model

In  the  three-parameter  logistic  model,  the  item  characteristic  function  is  not  point-symmetric,  as 
is  clear  from  the  formula  (1.3).  For  this  reason,  even  if  the  difficulty  parameters  of  all  the  n  items 
are  equal,  the  bias  function,  which  is  given  by  (1.4),  is  not  point-symmetric  either,  unlike  those  in 
the  normal  ogive  and  logistic  models.  Since  random  guessing  is  nothing  but  noise,  there  is  a  certain 
amount  of  decrement  in  the  accuracy  of  estimation,  especially  on  the  lower  levels  of  ability  or  latent 
trait.  Consequently,  we  must  expect  a  larger  amount  of  bias,  especially  on  the  lower  levels. 

For the purpose of illustration, the guessing parameters $c_g = 0.20$ and $c_g = 0.25$, respectively, were added to the
estimated discrimination and difficulty parameters of each of the 43 test items of the Iowa Level 11
Vocabulary Subtest, which are shown in Table 3-1, to create two more hypothetical tests.
Figure 3-11 presents the square roots of the test information functions of these two hypothetical tests by
dashed and dotted lines, respectively, in comparison with the one following the (two-parameter) logistic
model, which is shown by a solid line. We can see that, in each case, the decrement caused by the
guessing parameters is substantial, especially on the lower levels of ability. The bias functions of these
two hypothetical tests are shown in Figure 3-12 in comparison with the one for the logistic model, using
the corresponding types of lines. It is obvious that random guessing causes a substantial amount of
additional bias, especially on the lower levels of ability. Figure 3-13 compares the square root of test
information with the amount of bias for each of the two hypothetical tests, as we did previously. It
looks as if the same rule held in these two cases of the three-parameter logistic model, i.e., the amount
of bias is within the range of ±0.1 for the intervals of $\theta$ for which the square root of test information
is approximately 1.75 or greater, just as we observed in the examples of the (two-parameter) normal
ogive and logistic models. Such intervals of $\theta$ are substantially smaller, however.
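The effect of adding a common guessing parameter can be explored with a few lines of code. Since the parameter estimates of Table 3-1 are not reproduced here, the sketch below substitutes a hypothetical set of 13 items with unit discrimination and evenly spaced difficulties; only the qualitative pattern, not the numbers, should be compared with the figures.

```python
import math

D = 1.7  # scaling factor

def info_and_bias(theta, items):
    """Test information (1.7) and Lord's bias (1.4) for (a, b, c) triples."""
    info = 0.0
    num = 0.0
    for a, b, c in items:
        s = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
        p = c + (1.0 - c) * s
        dp = D * a * (1.0 - c) * s * (1.0 - s)
        ig = dp * dp / (p * (1.0 - p))
        info += ig
        num += a * ig * (s - 0.5)
    return info, D * num / info ** 2

# Hypothetical (a_g, b_g) pairs standing in for a real set of estimates.
base = [(1.0, -3.0 + 0.5 * i) for i in range(13)]

for c in (0.0, 0.20, 0.25):
    items = [(a, b, c) for a, b in base]
    for theta in (-2.0, 0.0, 2.0):
        info, bias = info_and_bias(theta, items)
        print(c, theta, round(math.sqrt(info), 3), round(bias, 4))
```

For any fixed $\theta$ the test information decreases as the guessing parameter grows, and the loss is largest at the lower end of the scale, which is where the additional bias is expected.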

In a similar manner, two additional hypothetical tests were created with $c_g = 0.20$ and $c_g = 0.25$ as
the guessing parameters, respectively, added to the estimated discrimination and difficulty parameters
of each item of Shiba’s Test J1, which are shown in Table 3-2. These results are presented in Figures
3-14 through 3-16. We can see in these figures that the results are very similar to the corresponding
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FIGURE  3-7 

Twenty-Five  Item  Characteristic  Functions  Following  Rasch  Model  with  D  =  1.7  and 
Equally  Spaced  Difficulty  Parameters  Ranging  from  -3.0  to  +3.0  . 


FIGURE  3-8 


Square  Root  of  Test  Information  of  Each  of  Three  Hypothetical  Tests  of  25  (Solid  Line), 
13  (Dashed  Line)  and  7  (Dotted  Line)  Items,  Respectively,  Following  Rasch  Model  with 
D  =  1.7  and  Equally  Spaced  Difficulty  Parameters  Ranging  from  -3.0  to  +3.0  . 
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FIGURE  3-9 


MLE  Bias  Function  of  Each  of  Three  Hypothetical  Tests  of  25  (Solid  Line),  13  (Dashed 
Line)  and  7  (Dotted  Line)  Items,  Respectively,  Following  Rasch  Model  with  D  =  1.7  and 
Equally  Spaced  Difficulty  Parameters  Ranging  from  -3.0  to  +3.0  . 
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SQ.RT.TST.INF.,  25  ITEMS,  RASCH  MODEL 
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SQ.RT.TST.INF..  13  ITEMS,  RASCH  MODEL 
MLE  BIAS,  13  ITEMS  IN  RASCH  MODEL 
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FIGURE  3-10 

MLE  Bias  Function  (Dashed  Line)  and  the  Square  Root  of  Test  Information  (Solid 
Line)  of  Each  of  Three  Hypothetical  Tests  of  25  (Solid  Line),  13  (Dashed  Line)  and  7 
(Dotted  Line)  Items,  Respectively,  Following  Rasch  Model  with  D  =  1.7  and  Equally 
Spaced  Difficulty  Parameters  Ranging  from  —3.0  to  +3.0  . 
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FIGURE  3-11 

Square  Roots  of  Test  Information  of  the  Iowa  Level  11  Vocabulary  Subtest  (Solid  Line)  of 
43  Items  Following  the  Logistic  Model,  and  of  Two  Hypothetical  Tests  of  the  Same 
Number  of  Items  Each  Following  the  Three- Parameter  Logistic  Model,  which  Share  the 
Same  Set  of  Item  Discrimination  and  Difficulty  Parameters  as  the  Iowa  Subtest  and  with 
the  Guessing  Parameters  0.20  (Dashed  Line)  and  0.25  (Dotted  Line),  Respectively. 
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FIGURE  3-12 

MLE  Bias  Functions  of  the  Iowa  Level  11  Vocabulary  Subtest  of  43  Items  Following  the 
Logistic  Model,  and  of  Two  Hypothetical  Tests  of  the  Same  Number  of  Items  Each 
Following  the  Three- Parameter  Logistic  Model,  which  Share  the  Same  Set  of  Item 
Discrimination  and  Difficulty  Parameters  as  the  Iowa  Subtest  and  with  the  Guessing 
Parameters  0.20  (Dashed  Line)  and  0.25  (Dotted  Line),  Respectively. 
FIGURE 3-13

MLE Bias Function (Dashed Line) and the Square Root of Test Information (Solid Line)
of Each of the Two Hypothetical Tests of 43 Items Each Following the Three-Parameter
Logistic Model, which Share the Same Set of Item Discrimination and Difficulty Parameters
as the Iowa Subtest and with the Guessing Parameters 0.20 (Dashed Line) and 0.25
(Dotted Line), Respectively.
FIGURE 3-14

Square Roots of Test Information of Shiba’s Word/Phrase Comprehension Test J1 (Solid
Line) of 55 Items Following the Logistic Model, and of Two Hypothetical Tests of
the Same Number of Items Each Following the Three-Parameter Logistic Model, which
Share the Same Set of Item Discrimination and Difficulty Parameters as Test J1 and
with the Guessing Parameters 0.20 (Dashed Line) and 0.25 (Dotted Line), Respectively.
FIGURE 3-15

MLE Bias Functions of Shiba’s Word/Phrase Comprehension Test J1 of 55 Items Following the
Logistic Model, and of Two Hypothetical Tests of the Same Number of Items Each Following
the Three-Parameter Logistic Model, which Share the Same Set of Item Discrimination and
Difficulty Parameters as Test J1 and with the Guessing Parameters 0.20 (Dashed Line) and
0.25 (Dotted Line), Respectively.
[Figure panels: Shiba Word Test J1, Case 2: square root of test information; MLE bias]

MLE Bias Function (Dashed Line) and the Square Root of Test Information (Solid Line)
of Each of the Two Hypothetical Tests of 55 Items Each Following the Three-Parameter
Logistic Model, which Share the Same Set of Item Discrimination and Difficulty Parameters
as Test J1 and with the Guessing Parameters 0.20 (Dashed Line) and 0.25 (Dotted Line),
Respectively.


results  of  the  Iowa  Level  11  Vocabulary  Subtest,  and  similar  conclusions  can  be  reached. 


4  Graded  Response  Level 

On the graded response level, the bias function is directly given by (2.43) by replacing the general
discrete response $k_g$ to item $g$ by the graded item score $x_g$ $(= 0, 1, \ldots, m_g)$.

In the homogeneous case of the graded response level (Samejima, 1972), the general formula for the
operating characteristic of the item score $x_g$ is given by

(4.1)  $P_{x_g}(\theta) = P^*_{x_g}(\theta) - P^*_{x_g+1}(\theta)$ ,

where

(4.2)  $P^*_{x_g}(\theta) = \int_{-\infty}^{a_g(\theta - b_{x_g})} \psi_g(t)\,dt$ ,

(4.3)  $-\infty = b_0 < b_1 < b_2 < \ldots < b_{m_g} < b_{m_g+1} = \infty$ ,

and $\psi_g(t)$ is some specified density function. When we replace the right hand side of (4.2) by that
of (3.1) with $b_g$ replaced by $b_{x_g}$, we have the operating characteristic of $x_g$ in the normal ogive
model on the graded response level; when we do a similar thing by using the right hand side of (1.5),
we obtain the operating characteristic of $x_g$ in the logistic model on the graded response level.
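As an illustration of (4.1) through (4.3), the following minimal sketch computes the operating characteristics of one graded item in the normal ogive model. The function names and the parameter values are hypothetical and are not taken from Table 4-1; the sketch assumes that the integrand in (4.2) is the standard normal density.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def graded_category_probs(theta, a, thresholds):
    """Operating characteristics P_x(theta), x = 0..m, for one graded item.

    thresholds holds the finite difficulties b_1 < ... < b_m; b_0 = -inf and
    b_{m+1} = +inf are implied, so that P*_0 = 1 and P*_{m+1} = 0.
    """
    if any(lo >= hi for lo, hi in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing, as in (4.3)")
    p_star = [1.0] + [normal_cdf(a * (theta - b)) for b in thresholds] + [0.0]
    # (4.1): P_x = P*_x - P*_{x+1}
    return [p_star[x] - p_star[x + 1] for x in range(len(thresholds) + 1)]

# Hypothetical item with three score categories (two finite thresholds).
probs = graded_category_probs(theta=0.3, a=1.7, thresholds=[-0.5, 0.8])
print(probs)
```

The category probabilities sum to unity by construction, since the differences in (4.1) telescope from $P^*_0 = 1$ down to $P^*_{m_g+1} = 0$.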

Since,  in  general,  the  graded  item  is  more  informative  than  the  dichotomous  item,  we  can  expect 
smaller  amounts  of  bias  on  the  graded  response  level  than  on  the  dichotomous  response  level.  Although 
the  relationship  between  the  configuration  of  the  difficulty  parameters  of  the  n  items  and  the  amount 
of  bias  is  more  complicated  on  the  graded  response  level,  it  will  be  easier  in  practice  to  develop  a  set  of 
items which provides us with negligibly small amounts of bias for a wide range of $\theta$.

We shall see some examples here. In the past years, the author has been engaged in developing
nonparametric approaches and methods of estimating the operating characteristics, or the conditional
probabilities, given ability $\theta$, assigned to separate discrete item responses. In other words, these
approaches and methods are based upon no assumptions concerning the mathematical forms of those
operating characteristics. In so doing, the asymptotic normal property of the maximum likelihood
estimate (MLE), i.e., the fact that, as the number of items increases, the conditional distribution of
MLE, given $\theta$, approaches normality with $\theta$ and the inverse of the square root of the test information
function as the two parameters, is fully utilized. A set of simulated data has been used for testing
these approaches and methods, in which 35 graded test items following the normal ogive model with
three item score categories each are hypothesized as the Old Test (cf. Samejima, 1977, 1981). Table
4-1 presents the item discrimination parameter $a_g$ and the two item response difficulty parameters,
i.e., $b_{x_g}$ for $x_g = 1, 2$, for each of the 35 hypothesized items. The square root of the test information
function of this Old Test is shown as the solid curve in Figure 4-1. The bias function, which was
computed through (2.43), is shown in Figure 4-2 as the solid curve. We can see in this figure that for
the interval of $\theta$ covering (-4, 4) the bias of the maximum likelihood estimate is practically zero, i.e.,
the MLE of ability is practically unbiased for this range of $\theta$. Thus one of the necessary conditions
to justify the use of the asymptotic normality as the approximation for the conditional distribution of
MLE, given $\theta$, is satisfied.

We notice in Figure 4-1 that for the range of $\theta$, (-3, 3), the square root of the test information
function of this Old Test assumes approximately a constant value of 4.65, and we have already seen
that for the wider range of $\theta$ the bias function is practically zero. It is interesting to note that
FIGURE 4-1

Square Roots of the Test Information Functions for the Old Test of 35 Graded Items (Solid
Line), for the Two Sets of 35 Redichotomized Items Using the First Set (Dashed Line) and
Second Set (Dotted Line) of the Difficulty Parameters of the Old Test, Respectively,
Following the Normal Ogive Model.
FIGURE 4-2

Bias Functions of the Old Test of 35 Graded Items (Solid Line) and of the Two Sets of 35
Redichotomized Items Using the First Set (Dashed Line) and the Second Set (Dotted
Line) of the Difficulty Parameters of the Old Test, Respectively, Following the Normal
Ogive Model.
FIGURE 4-3

Comparison of the Square Root of the Test Information Function (Solid Line) and the Bias
Function (Dashed Line) of Each of the Three Tests, i.e., the Old Test (First Graph) and
the Two Sets of Redichotomized Items Using the First (Second Graph) and the Second
(Third Graph) Sets of Difficulty Parameters, Respectively.


the bias starts showing up both positively and negatively when the square root of test information drops
lower than a critical value, which is approximately 3.2, or the test information function drops lower
than approximately 10. In order to pursue this relationship, two more sets of these two functions
are also shown by dashed and dotted curves in Figures 4-1 and 4-2. These two sets were created
by redichotomizing the graded items of the Old Test, using the first and second sets of the difficulty
parameters in Table 4-1, respectively. We can see that for a wide range of $\theta$ the square root of test
information is substantially less than that of the original Old Test, which is the natural consequence of
redichotomizing the items. It is noticed that for each of these two, the square root of the test information
function is barely greater than 3.2 for a wide range of $\theta$, and the bias is still practically nil. Again
the bias appears both positively and negatively when the square root of the test information function
drops lower than approximately 3.2. In order to make this observation easier, both the square root of
the test information function and the bias function are plotted together in Figure 4-3 for each of the
three hypothetical tests, by solid and dashed lines, respectively. If we tolerate biases of up to ±0.1, as
we did earlier, then the critical value of the square root of test information will approximately be 2.75,
or that of the test information function approximately 7.5. When the square root of test information
drops to less than 2.0, the bias turns out to be substantially large.

It is interesting to note that, in these examples, the amount of information required to make the
bias negligibly small is larger than that observed in the previous examples, i.e., those of the Iowa Level
11 Vocabulary Subtest and Shiba’s Test J1. This has something to do with the fact that in the Old
Test there are only 35 test items with the average discrimination parameter as high as 1.70, while
there are as many as 43 and 55 items in the Iowa Level 11 Vocabulary Subtest and Shiba’s Test J1, with
the average values of discrimination parameters 0.601 and 0.538, respectively. We shall investigate
the  effect  of  discrimination  parameters  on  the  amount  of  bias  from  a  somewhat  different  angle  in  the 
following  section. 

5  Effect  of  the  Discrimination  Parameter 

In  the  previous  sections,  we  have  seen  from  several  examples  that,  if  the  amount  of  test  information 
is  substantially  large,  the  amount  of  bias  is  negligibly  small,  and  it  looks  as  if  there  were  a  critical  value  of 
the  square  root  of  test  information  to  realize  this  approximate  unbiasedness  of  the  maximum  likelihood 
estimate.  This  critical  value  differs  from  test  to  test,  however,  as  we  have  observed  in  the  examples 
of the Old Test, the Iowa Level 11 Vocabulary Subtest and Shiba’s Test J1. On the dichotomous
response  level,  the  effect  of  the  configuration  of  the  difficulty  parameters  in  a  test  can  be  seen  fairly 
straightforwardly  from  the  formulae  of  the  bias  function  for  different  mathematical  models,  as  we  have 
observed  earlier.  In  this  section,  we  shall  pursue  the  effect  of  the  discrimination  parameters  on  the 
bias  function.  In  so  doing,  we  choose  tests  of  equivalent  items,  i.e.,  each  of  which  consists  of  test  items 
whose  item  characteristic  functions  are  identical.  We  have  seen  earlier  that  in  such  a  case,  a  substantial 
amount of bias starts showing up as $\theta$ departs from the common value of difficulty parameters.

In order to simplify the notation, in this section, we use $P$, $Q$, $I_g$, $\Psi$ and $\phi$ to indicate $P_g(\theta)$,
$Q_g(\theta)$, $I_g(\theta)$, $\Psi_g(\theta)$ and

(5.1)  $\phi = \phi_g\{a(\theta - b)\} = (2\pi)^{-1/2} e^{-a^2(\theta - b)^2/2}$ ,

which are common for all the $n$ items, where $a = a_1 = \ldots = a_n$ and $b = b_1 = \ldots = b_n$. In the
normal ogive model, (3.4) can be simplified for the set of $n$ equivalent items so that we obtain

(5.2)  $B(\theta) = (1/2)\,PQ\,(\theta - b)\,n^{-1}\phi^{-2}$ ,

because of (1.6), (1.7) and (3.2). We have for the partial derivative of $B(\theta)$ with respect to $a$

(5.3)  $\frac{\partial}{\partial a} B(\theta) = (2n)^{-1}(\theta - b)\,\phi^{-4}\left[\phi^2 \frac{\partial}{\partial a}(PQ) - 2\phi\,PQ\,\frac{\partial \phi}{\partial a}\right]$ .


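By virtue of the fact that

(5.4)  $\frac{\partial}{\partial a}(PQ) = \phi\,(\theta - b)(Q - P)$

and

(5.5)  $\frac{\partial \phi}{\partial a} = \phi\,\{-a(\theta - b)^2\}$ ,

we can write for the last factor of the right hand side of (5.3)

(5.6)  $\phi^2 \frac{\partial}{\partial a}(PQ) - 2\phi\,PQ\,\frac{\partial \phi}{\partial a} = 2\phi^2(\theta - b)\,PQ\left[a(\theta - b) - \frac{\phi}{2}\left\{\frac{1}{Q} - \frac{1}{P}\right\}\right]$ .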


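We notice that the second term in the brackets on the right hand side of (5.6) equals
$\frac{1}{2}[E\{u \mid u < a(\theta - b)\} + E\{u \mid u > a(\theta - b)\}]$, with $\phi(u)$ the standard normal density, since we have

(5.7)  $E\{u \mid u < a(\theta - b)\} = \int_{-\infty}^{a(\theta - b)} u\,\phi(u)\,du \,/\, P = [-\phi(u)]_{-\infty}^{a(\theta - b)} \,/\, P = -\phi/P$

and

(5.8)  $E\{u \mid u > a(\theta - b)\} = \int_{a(\theta - b)}^{\infty} u\,\phi(u)\,du \,/\, Q = [-\phi(u)]_{a(\theta - b)}^{\infty} \,/\, Q = \phi/Q$ .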


It is obvious that this average of the two expectations of $u$ equals zero when $\theta = b$ and assumes
negative and positive values when $\theta < b$ and $\theta > b$, respectively. In addition, we obtain

(5.9)  $\frac{\phi}{2}\left\{\frac{1}{Q} - \frac{1}{P}\right\} \begin{cases} > a(\theta - b) & \theta < b \\ = a(\theta - b) & \theta = b \\ < a(\theta - b) & \theta > b \end{cases}$

To prove this, since we can write for the item information function in the normal ogive model

(5.10)  $I_g = a^2\phi^2(PQ)^{-1}$ ,

its first derivative $I_g'$ with respect to $\theta$ is given by

(5.11)  $I_g' = 2a\,I_g\left[\frac{\phi}{2}\left\{\frac{1}{Q} - \frac{1}{P}\right\} - a(\theta - b)\right]$ .

Setting (5.11) equal to zero, we obtain $\theta = b$. From (5.11), we can see that the second derivative $I_g''$ of
the item information function with respect to $\theta$ assumes $4a^4\pi^{-2}(2 - \pi)$ at $\theta = b$, which is negative.
Thus $\theta = b$ is the point of $\theta$ at which $I_g$ is maximal, and $I_g'$ assumes positive values for $\theta < b$
and negative values for $\theta > b$. Equation (5.9) is the direct consequence of this fact.

Figure  5-1  presents  the  square  root  of  the  test  information  function  for  each  of  the  five  examples 
following  the  normal  ogive  model  by  a  solid  line  and  dashed  lines  of  various  lengths.  In  these  five 
examples,  the  numbers  of  equivalent  items  are  uniformly  30,  and  the  common  values  of  the  difficulty 
parameter  are  all  0.0  .  The  common  values  of  the  discrimination  parameter  differ  for  different  tests, 
however,  i.e.,  they  assume  0.4  ,  0.7  ,  1.0  ,  1.5  and  2.0  ,  respectively.  The  five  bias  functions  for 
these five hypothetical tests are shown in Figure 5-2, using the same set of solid and dashed lines.

We can see in these figures how rapidly the amount of bias increases when the common discrimination
parameter is large, in both negative and positive directions as $\theta$ departs from zero, at which the square
root of test information is maximal, especially when $a_g = 2.0$. It is also noted that, taking the criterion
of ±0.1 again, in order to keep the practical unbiasedness the square root of test information must be
as large as 4.0 when $a_g = 2.0$, while it can be as small as 1.3 when $a_g = 0.4$. For the intermediate
values of the discrimination parameter, i.e., for $a_g = 0.7, 1.0, 1.5$, the corresponding criterion values
of the square root of test information are approximately 2.0, 2.5 and 3.0, respectively.
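The two quantities compared in these figures can be sketched numerically as follows. This is a minimal illustration of (5.2) and (5.10) for $n$ equivalent normal ogive items; the function names are hypothetical, and the code does not reproduce the plotting procedure used for Figures 5-1 and 5-2.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def p_and_q(z):
    """P = Phi(z) and Q = 1 - Phi(z), each computed via erfc for tail accuracy."""
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    q = 0.5 * math.erfc(z / math.sqrt(2.0))
    return p, q

def sqrt_info_normal_ogive(theta, a, b, n):
    """Square root of the test information, n * a^2 * phi^2 / (P Q), from (5.10)."""
    z = a * (theta - b)
    p, q = p_and_q(z)
    return math.sqrt(n * a * a * phi(z) ** 2 / (p * q))

def bias_normal_ogive(theta, a, b, n):
    """Bias function (5.2): (1/2) P Q (theta - b) / (n phi^2)."""
    z = a * (theta - b)
    p, q = p_and_q(z)
    return 0.5 * p * q * (theta - b) / (n * phi(z) ** 2)

# The five common discrimination values of the text, with n = 30 and b = 0.0.
for a in (0.4, 0.7, 1.0, 1.5, 2.0):
    print(a, sqrt_info_normal_ogive(1.0, a, 0.0, 30), bias_normal_ogive(1.0, a, 0.0, 30))
```

Evaluating both functions over a grid of $\theta$ and locating the point at which the bias first reaches ±0.1 is one way to read off critical values of the kind quoted above.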

In the logistic model, we can rewrite (1.4) for the set of $n$ equivalent items to obtain

(5.12)  $B(\theta) = D\,n\,a\,I_g\,(\Psi - \frac{1}{2})\,[nI_g]^{-2} = (\Psi - \frac{1}{2})\,[nDa\,\Psi(1 - \Psi)]^{-1}$ ,

by virtue of (1.3), (1.5), (1.6) and the fact that

(5.13)  $\Psi_g'(\theta) = Da_g\,\Psi_g(\theta)[1 - \Psi_g(\theta)]$ ,

where $\Psi_g'(\theta)$ indicates the first derivative of $\Psi_g(\theta)$ with respect to $\theta$. Since we have

(5.14)  $\frac{\partial}{\partial a_g}\Psi_g(\theta) = D(\theta - b_g)\,\Psi_g(\theta)[1 - \Psi_g(\theta)]$ ,

the numerator of the partial derivative of $B(\theta)$ with respect to $a$ can be written as

(5.15)  $nD^2a\,\Psi^2[1 - \Psi]^2(\theta - b) - [\Psi - \frac{1}{2}]\,nD^2a\,\Psi[1 - \Psi][1 - 2\Psi](\theta - b)$
        $= nD^2a\,\Psi[1 - \Psi][\frac{1}{2} - \Psi(1 - \Psi)](\theta - b)$ .

Since all the factors on the right hand side of (5.15) are positive except for the last one, we can conclude
that the amount of bias equals zero at $\theta = b$ regardless of the value of $a$, and increases in the positive
and negative directions for $\theta > b$ and $\theta < b$, respectively.
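A minimal numerical sketch of the second form of (5.12) is given below. The function names are hypothetical. The second function uses the identity $(\Psi - \frac{1}{2})/\{\Psi(1 - \Psi)\} = \sinh\{Da(\theta - b)\}$, which is not stated in the text but follows algebraically from the definition of the logistic function; it is included only as a cross-check.

```python
import math

D = 1.7  # scaling factor used in the text

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def bias_logistic(theta, a, b, n):
    """Second form of (5.12): (Psi - 1/2) / [n D a Psi (1 - Psi)]."""
    psi = logistic(D * a * (theta - b))
    return (psi - 0.5) / (n * D * a * psi * (1.0 - psi))

def bias_logistic_sinh(theta, a, b, n):
    """Algebraically equivalent form: sinh(D a (theta - b)) / (n D a)."""
    return math.sinh(D * a * (theta - b)) / (n * D * a)

# The five common discrimination values of the text, with n = 30 and b = 0.0.
for a in (0.4, 0.7, 1.0, 1.5, 2.0):
    print(a, bias_logistic(1.0, a, 0.0, 30), bias_logistic_sinh(1.0, a, 0.0, 30))
```

In the hyperbolic sine form it is immediate that the bias vanishes at $\theta = b$ and grows in absolute value with $a$ on either side of $b$, in agreement with the conclusion drawn from (5.15).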

Figures  5-3  and  5-4  present  the  square  root  of  the  test  information  function  and  the  bias  function 
for  each  of  the  five  hypothetical  tests  following  the  logistic  model,  respectively,  which  share  the  same 
number  of  items  and  the  parameter  values  as  those  five  hypothetical  tests  following  the  normal  ogive 
model.  These  results  are  very  similar  to  those  obtained  for  the  normal  ogive  model,  except  for  the  fact 
that  the  intervals  of  practical  unbiasedness  are  a  little  smaller. 
FIGURE 5-1

Square Roots of Test Information for the Five Hypothetical Tests of 30 Equivalent Items
Following the Normal Ogive Model. The Common Values of the Difficulty Parameter
Are Uniformly 0.0, and Those of the Discrimination Parameter are 0.4, 0.7, 1.0,
1.5, and 2.0, Respectively.
FIGURE 5-2

Bias Functions for the Five Hypothetical Tests of 30 Equivalent Items Following the
Normal Ogive Model. The Common Values of the Difficulty Parameter Are
Uniformly 0.0, and Those of the Discrimination Parameter are 0.4, 0.7,
1.0, 1.5, and 2.0, Respectively.
FIGURE 5-3

Square Roots of Test Information for the Five Hypothetical Tests of 30 Equivalent Items
Following the Logistic Model. The Common Values of the Difficulty Parameter Are
Uniformly 0.0, and Those of the Discrimination Parameter are 0.4, 0.7, 1.0,
1.5, and 2.0, Respectively.
FIGURE 5-4

Bias Functions for the Five Hypothetical Tests of 30 Equivalent Items Following the
Logistic Model. The Common Values of the Difficulty Parameter Are Uniformly 0.0,
and Those of the Discrimination Parameter are 0.4, 0.7, 1.0, 1.5 and 2.0,
Respectively.


6  Effect  of  the  Number  of  Items 


It  is  obvious  from  (1.7),  (2.43)  and  (2.46)  that  the  number  of  items  in  a  test  affects  the  amount 
of  bias  through  the  test  information  function,  in  the  negative  way  such  that  its  increase  causes  the 
decrease  in  the  amount  of  bias.  We  have  also  observed  from  our  examples  that,  even  if  the  amounts 
of  test  information  are  the  same  for  two  different  tests,  they  may  not  share  the  same  amount  of  bias. 
In  this  regard,  we  have  seen  that  the  values  of  the  item  discrimination  parameters  affect  the  amount 
of  bias  in  the  positive  direction.  It  has  also  been  pointed  out  that  the  configuration  of  the  difficulty 
parameters  in  a  test  affects  the  bias  function. 


In  order  to  demonstrate  these  effects  further,  in  this  section,  we  shall  observe  the  effect  of  the  number 
of  items  using  different  numbers  of  equivalent  items,  each  of  which  follows  the  constant  information 
model  (Samejima,  1979)  on  the  dichotomous  response  level.  The  item  characteristic  function  in  the 
constant  information  model  is  defined  by 


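(6.1)  $P_g(\theta) = \sin^2\{a_g(\theta - \beta_g) + (\pi/4)\}$ ,

where $a_g$ and $\beta_g$ are the item discrimination and difficulty parameters, respectively. From (6.1) we
obtain

(6.2)  $Q_g(\theta) = \cos^2\{a_g(\theta - \beta_g) + (\pi/4)\}$ .

This model provides us with a constant amount of item information, i.e.,

(6.3)  $I_g(\theta) = 4a_g^2$ ,

for the interval of $\theta$ such that

(6.4)  $-\pi(4a_g)^{-1} + \beta_g < \theta < \pi(4a_g)^{-1} + \beta_g$ .

Since we have

(6.5)  $P_g'(\theta) = 2a_g[P_g(\theta)Q_g(\theta)]^{1/2}$ ,

(6.6)  $P_g''(\theta) = 2a_g^2\{Q_g(\theta) - P_g(\theta)\}$ ,

substituting these and (6.3) into (2.46) and rearranging, we can write for the bias function in the
constant information model

(6.7)  $B(\theta) = 2\{I(\theta)\}^{-2}\sum_{g=1}^{n} a_g^3\{P_g(\theta) - Q_g(\theta)\}\{P_g(\theta)Q_g(\theta)\}^{-1/2}$ .

For the set of $n$ equivalent items, we can simplify (6.7) into the form

(6.8)  $B(\theta) = \{8na_g\}^{-1}\{P_g(\theta) - Q_g(\theta)\}\{P_g(\theta)Q_g(\theta)\}^{-1/2}$ .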
FIGURE 6-1

MLE Bias Function for Each of the Twenty Different Values of $n$, Which Starts with 10,
Increases by 10 Successively, and Ends with 200. The Common Item Parameters for
These Equivalent Items Are Given by $a_g = 0.25$ and $\beta_g = 0.00$, and, Therefore, in Each
Set of Equivalent Items, the Test Information Function Assumes a Constant Value, $0.25n$,
for the Range of $\theta$, $-\pi < \theta < \pi$.
Figure 6-1 presents, in two graphs, this bias function for each of the twenty different values of
$n$, which starts with 10, increases by 10 successively, and ends with 200. The common item
parameters for these equivalent items are given by $a_g = 0.25$ and $\beta_g = 0.00$, and, therefore, in each
set of equivalent items, the test information function assumes a constant value of $0.25n$ for the range
of $\theta$, $-\pi < \theta < \pi$. We can see in this figure that the amount of bias is substantial as $\theta$ departs from
$\beta_g = 0.00$ when $n$ is relatively small, but it becomes negligibly small for larger values of $n$, especially
when $n$ exceeds 100. These results also illustrate the fact that the amount of test information
alone does not control the amount of bias, since they are based upon the constant information model,
and the amount of test information is a constant in each set of equivalent items for the range of $\theta$,
$-\pi < \theta < \pi$. One use of the constant information model is that it can serve as the benchmark
when we deal with a set of equivalent items following such frequently used mathematical models as the
normal ogive model, the logistic model, Rasch model, etc. This will be shown with respect to the bias
function in the following section.
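The behavior described for Figure 6-1 can be sketched numerically as follows, assuming the equivalent-item bias function takes the form $\{8na_g\}^{-1}\{P_g - Q_g\}\{P_g Q_g\}^{-1/2}$ with $P_g = \sin^2\{a_g(\theta - \beta_g) + \pi/4\}$. The function names are hypothetical.

```python
import math

def constant_info_bias(theta, a, beta, n):
    """Bias for n equivalent items of the constant information model."""
    x = a * (theta - beta) + math.pi / 4.0
    if not 0.0 < x < math.pi / 2.0:
        raise ValueError("theta lies outside the interval of constant information")
    p = math.sin(x) ** 2
    q = math.cos(x) ** 2
    return (p - q) / (8.0 * n * a * math.sqrt(p * q))

def constant_test_information(a, n):
    """Test information for n equivalent items, each contributing 4 a^2."""
    return 4.0 * n * a * a

# Values of n as in Figure 6-1 (10, 20, ..., 200), with a_g = 0.25 and beta_g = 0.00.
for n in range(10, 201, 10):
    print(n, constant_test_information(0.25, n), constant_info_bias(2.0, 0.25, 0.0, n))
```

Under this form the bias at a fixed $\theta$ is inversely proportional to $n$, while within each test the information stays constant over the whole interval and the bias does not, which is the point made in the paragraph above.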

7  Scale  Transformation 

It  is  obvious  from  (2.43)  that  the  bias  function  belongs  to  a  particular  scale  of  the  latent  trait, 
and,  if  the  scale  is  transformed,  the  function  changes  also.  In  this  section,  we  shall  see  how  the  scale 
transformation  affects  the  bias  function  of  a  test. 

7.1  General  Case  of  Discrete  Responses 

Let $\tau$ be a strictly increasing transformation of $\theta$, so that we can write

(7.1)  $\tau = \tau(\theta)$ .

We assume that $\tau$ is twice differentiable with respect to $\theta$. Since the operating characteristic, or the
conditional probability, of the discrete item response $k_g$ is unchanged for the scale transformation, we
can write

(7.2)  $P^*_{k_g}(\tau) = P_{k_g}(\theta)$ ,

where $P^*_{k_g}(\tau)$ denotes the operating characteristic of the discrete item response $k_g$ as a function of
the transformed latent trait $\tau$. From (7.2) we obtain for the first and second derivatives, $P^{*\prime}_{k_g}(\tau)$ and
$P^{*\prime\prime}_{k_g}(\tau)$, of the operating characteristic of $k_g$ with respect to $\tau$

(7.3)  $P^{*\prime}_{k_g}(\tau) = P'_{k_g}(\theta)\,\frac{d\theta}{d\tau}$

and

(7.4)  $P^{*\prime\prime}_{k_g}(\tau) = P''_{k_g}(\theta)\left[\frac{d\theta}{d\tau}\right]^2 + P'_{k_g}(\theta)\,\frac{d^2\theta}{d\tau^2}$ .

From (7.3), (7.4) and the definitions of the item response information function and of the item
information function, which are given by (1.1) and (1.2), respectively, we can write for the item information
function $I^*_g(\tau)$ of item $g$ as a function of the transformed latent trait $\tau$

(7.5)  $I^*_g(\tau) = I_g(\theta)\left[\frac{d\theta}{d\tau}\right]^2$ .




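Hence we have for the test information function $I^*(\tau)$ on the transformed scale of the latent trait

(7.6)  $I^*(\tau) = \sum_{g=1}^{n} I^*_g(\tau) = \sum_{g=1}^{n} I_g(\theta)\left[\frac{d\theta}{d\tau}\right]^2 = I(\theta)\left[\frac{d\theta}{d\tau}\right]^2$ .

We can write from (2.43) for the bias function, $B^*(\tau)$, of a test after the scale transformation

(7.7)  $B^*(\tau) = -\frac{1}{2}[I^*(\tau)]^{-2}\sum_{g=1}^{n}\sum_{k_g} P^{*\prime}_{k_g}(\tau)\,P^{*\prime\prime}_{k_g}(\tau)\,[P^*_{k_g}(\tau)]^{-1}$ .

Substituting (7.2), (7.3), (7.4), and (7.6) into (7.7) and rearranging, we obtain

(7.8)  $B^*(\tau) = B(\theta)\left[\frac{d\theta}{d\tau}\right]^{-1} - \frac{1}{2}[I(\theta)]^{-1}\left[\frac{d\theta}{d\tau}\right]^{-3}\frac{d^2\theta}{d\tau^2}$ .

The bias function of the transformed latent variable can be computed, therefore, from the original bias
function, the original test information function and the first and second derivatives of $\theta$ with respect
to $\tau$.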
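The computation just described can be sketched as a small function, assuming the transformed bias is $B(\theta)[d\theta/d\tau]^{-1} - \frac{1}{2}[I(\theta)]^{-1}[d\theta/d\tau]^{-3}\,d^2\theta/d\tau^2$ and the transformed information is $I(\theta)[d\theta/d\tau]^2$. The function names and the numerical values are hypothetical.

```python
def transformed_bias(bias, info, dtheta_dtau, d2theta_dtau2):
    """Bias on the transformed scale from the original bias and information
    at theta and the first two derivatives of theta with respect to tau."""
    return bias / dtheta_dtau - 0.5 * d2theta_dtau2 / (info * dtheta_dtau ** 3)

def transformed_info(info, dtheta_dtau):
    """Test information on the transformed scale."""
    return info * dtheta_dtau ** 2

# Linear rescaling tau = 2 * theta + 1: d(theta)/d(tau) = 0.5, second derivative 0.
linear = transformed_bias(bias=0.1, info=10.0, dtheta_dtau=0.5, d2theta_dtau2=0.0)

# Exponential rescaling tau = exp(theta), evaluated at tau = 2:
# theta = log(tau), so d(theta)/d(tau) = 1/tau and the second derivative is -1/tau**2.
tau = 2.0
exponential = transformed_bias(bias=0.1, info=10.0,
                               dtheta_dtau=1.0 / tau, d2theta_dtau2=-1.0 / tau ** 2)
print(linear, exponential)
```

For a linear transformation the second term vanishes and the bias is simply rescaled, whereas a nonlinear transformation adds a term that depends on the test information.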


7.2  Scale  Transformation  to  Generate  a  Constant  Test  Information 

In a nonparametric approach for estimating the operating characteristics of discrete item responses,
the latent trait $\theta$ is transformed to $\tau$ in such a way that the resultant test information function $I^*(\tau)$
assumes a constant value for the interval of $\tau$ of interest, in which the ability levels of most subjects
are included (cf. Samejima, 1981). In this way, the approximation of the conditional distribution of the
maximum likelihood estimate $\hat{\tau}$, given $\tau$, by a normal distribution with the parameters $\tau$ and $\sigma$,
the latter of which does not depend upon $\tau$, becomes more justifiable. Thus we can write

(7.9)  $I^*(\tau) = C^2$ ,  $C > 0$ .

Substituting this into (7.6) and rearranging, we obtain for the first derivative of $\theta$ with respect to $\tau$

(7.10)  $\frac{d\theta}{d\tau} = C[I(\theta)]^{-1/2}$ .

The transformation of the latent trait $\theta$ to $\tau$ is given by

(7.11)  $\tau = \tau(\theta) = C^{-1}\int_{-\infty}^{\theta}[I(t)]^{1/2}\,dt + \delta$ ,

where $\delta$ indicates a constant which determines the origin of $\tau$. From (7.10) we have for the second
derivative of $\theta$ with respect to $\tau$

(7.12)  $\frac{d^2\theta}{d\tau^2} = -\frac{1}{2}C^2[I(\theta)]^{-2}I'(\theta)$ ,

where $I'(\theta)$ indicates the first derivative of the test information function $I(\theta)$ with respect to $\theta$,
for which we can write

(7.13)  $I'(\theta) = \sum_{g=1}^{n}\sum_{k_g}\left[2P'_{k_g}(\theta)P''_{k_g}(\theta)\{P_{k_g}(\theta)\}^{-1} - \{P'_{k_g}(\theta)\}^3\{P_{k_g}(\theta)\}^{-2}\right]$ .


Substituting (7.12) and (7.13) into (7.8) and rearranging, we obtain

(7.14)  $B^*(\tau) = B(\theta)\,C^{-1}[I(\theta)]^{1/2} + \frac{1}{4}C^{-1}[I(\theta)]^{-3/2}I'(\theta)$ .

Thus in this specific situation (7.8) is simplified to include only the original bias function, the original
test information function and its derivative, and the constant square root of test information after the
latent variable has been transformed.

Let $K(\theta)$ denote the square root of the test information function $I(\theta)$. Thus we can write

(7.15)  $K(\theta) = [I(\theta)]^{1/2}$ .

Since we have

(7.16)  $I'(\theta) = 2K(\theta)K'(\theta)$ ,

where $K'(\theta)$ is the derivative of $K(\theta)$ with respect to $\theta$, we can rewrite (7.14) in the form

(7.17)  $B^*(\tau) = B(\theta)\,C^{-1}K(\theta) + \frac{1}{2}C^{-1}[K(\theta)]^{-2}K'(\theta)$ .

We notice from (7.17) that, in a typical situation where the square root of test information is unimodal,
as is exemplified by those functions obtained for the Iowa Level 11 Vocabulary Subtest and Shiba’s
Word/Phrase Comprehension Test J1, which are shown in Figure 3-1, the amount of bias is decreased
by the transformation of $\theta$ to $\tau$ for extreme values of the transformed latent variable where there
used to be substantial amounts of bias either in negative or positive direction (cf. Figure 3-2). If we set
the value of $C$ in such a way that, for a meaningful interval of $\theta$, the values of the two endpoints are
practically unchanged over the transformation from $\theta$ to $\tau$ in order to avoid overall radical changes
of scale values, then the factor $C^{-1}K(\theta)$ by which the original bias function $B(\theta)$ is multiplied takes
on values less than unity for extreme values of $\tau$, causing reduction in the amount of bias. In addition
to this fact, the second term of the right hand side of (7.17) further reduces negative biases as we go
toward extremely lower levels of the transformed scale and decreases positive biases as we go toward the
other extreme, for $K'(\theta)$ is positive on lower levels of $\tau$ and negative on higher levels, respectively,
and $K(\theta)$ assumes small positive values on both.

Figure 7-1 presents the constant square root $C$ of test information $I^*(\tau)$ by dashed lines, in
comparison with the square root of the original test information function $I(\theta)$ which is drawn by a
solid line, of the Iowa Level 11 Vocabulary Subtest. The original square root of the test information
function is based upon the normal ogive model and has already been shown in Figure 3-1. The values
of $C$ and $\delta$ in (7.11) were chosen in such a way that $\tau$ is set equal to $\theta$ at $\theta = \pm 4.0$. In this way,
we can avoid radical changes between the two sets of scale values. As the result, the constant square
root of test information $C$ turned out to be approximately 2.22617674. We can see in Figure 7-1
that for the interval, (-4.0, 4.0), the areas under the two square roots of test information are equal.
The resulting bias function $B^*(\tau)$ is shown in Figure 7-2 by a dashed line, in comparison with the
original bias function $B(\theta)$. We can see a substantial decrease in the amount of bias caused by the
scale transformation.
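The strategy of matching the two scales at $\theta = \pm 4.0$ can be sketched numerically as follows. Since the item parameters of the Iowa Subtest are not reproduced here, the sketch uses a hypothetical test of 30 equivalent logistic items with $a = 1.0$ and $b = 0.0$; the function names, the trapezoidal integration and the finite-difference derivative are all assumptions made for illustration. Anchoring the integral at the lower endpoint absorbs the constant that fixes the origin of $\tau$.

```python
import math

D = 1.7  # scaling factor used in the text

def sqrt_info(theta, a=1.0, b=0.0, n=30):
    """K(theta) for n equivalent logistic items (hypothetical test)."""
    psi = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return math.sqrt(n) * D * a * math.sqrt(psi * (1.0 - psi))

def bias(theta, a=1.0, b=0.0, n=30):
    """B(theta) for the same hypothetical test."""
    psi = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return (psi - 0.5) / (n * D * a * psi * (1.0 - psi))

def constant_info_transform(K, lo=-4.0, hi=4.0, steps=4000):
    """Return (C, grid, tau) such that tau(lo) = lo and tau(hi) = hi."""
    h = (hi - lo) / steps
    grid = [lo + i * h for i in range(steps + 1)]
    cum = [0.0]
    for t0, t1 in zip(grid, grid[1:]):
        cum.append(cum[-1] + 0.5 * h * (K(t0) + K(t1)))  # trapezoidal rule
    C = cum[-1] / (hi - lo)  # equal areas under K and under the constant C
    tau = [lo + c / C for c in cum]
    return C, grid, tau

def bias_after_transform(theta, K, B, C, eps=1e-5):
    """B*(tau) = B K / C + K' / (2 C K^2), with K' by central difference."""
    k = K(theta)
    k_prime = (K(theta + eps) - K(theta - eps)) / (2.0 * eps)
    return B(theta) * k / C + 0.5 * k_prime / (C * k * k)

C, grid, tau = constant_info_transform(sqrt_info)
print(C, bias(3.0), bias_after_transform(3.0, sqrt_info, bias, C))
```

For this hypothetical test the transformed bias at $\theta = 3.0$ is considerably smaller in absolute value than the original bias, which is the kind of reduction described above.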

Figures 7-3 and 7-4 present the corresponding results for Shiba’s Word/Phrase Comprehension Test
J1. The transformation of $\theta$ to $\tau$ was made following the same strategy that was used for the Iowa
Subtest. The resultant constant square root of test information $C$ is approximately 2.39633860. We
can see in this result that, after the scale transformation, the maximum likelihood estimate of $\tau$ is
practically unbiased, if we accept the criterion of ±0.1 as we did before.
FIGURE 7-1

Square  Roots  of  Test  Information  of  the  Iowa  Level  11  Vocabulary  Subtest  Before  (Solid 
Line)  and  After  (Dashed  Line)  the  Scale  Transformation.  Transformation  is  made  in  such 
a  way  that  the  two  scales  match  at  -4.0  and  +4.0  . 
FIGURE 7-2

MLE Bias Function of the Iowa Level 11 Vocabulary Subtest as a Function of the Transformed
Latent Variable $\tau$ (Dashed Line) in Comparison to the Original MLE Bias Function (Solid
Line) of $\theta$.
FIGURE 7-3

Square Roots of Test Information of Shiba’s Word/Phrase Comprehension Test J1 Before
(Solid Line) and After (Dashed Line) the Scale Transformation. Transformation is made in
such a way that the two scales match at -4.0 and +4.0.
FIGURE 7-4

MLE Bias Function of Shiba’s Word/Phrase Comprehension Test J1 as a Function of the
Transformed Latent Variable $\tau$ (Dashed Line) in Comparison to the Original MLE
Bias Function (Solid Line) of $\theta$. Transformation of $\theta$ to $\tau$ Is Made by a Polynomial
Approximation.


It has been demonstrated that the square root of the test information can be approximated by a
polynomial of a suitable degree obtained by the method of moments, which proves to be also the least
squares solution (cf. Samejima and Livingston, 1979). It has also been shown that such approximations
have worked well with many different sets of data (e.g., Samejima, 1981, 1984a). For the purpose of
illustration, the polynomial approximation was used with the Iowa Level 11 Vocabulary Subtest, and
the resulting scale transformation is given by


$\tau = 0.3777014 + 1.4301120\,\theta - 0.0528854\,\theta^{2} - 0.0408096\,\theta^{3} + 0.0029404\,\theta^{4}$

$\qquad +\; 0.0011037\,\theta^{5} - 0.0000858\,\theta^{6} - 0.0000146\,\theta^{7} + 0.0000010\,\theta^{8}$ .


In this scale transformation, the same strategy was taken as before, so that $\tau(\theta) = \theta$ at $\theta = \pm 4.0$.
The constant square root of the test information function of $\tau$ turned out to be 2.231709, which
is very close to the corresponding value of 2.22617674, which was obtained without the polynomial
approximation. The bias function $B^{*}(\tau)$ thus obtained is shown in Figure 7-5 by a dashed line, in
comparison with the original $B(\theta)$, which is drawn by a solid line. We can see that this result is
practically identical with the one obtained without the polynomial approximation, which is shown in
Figure 7-2.
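As a quick check on the polynomial above, the following minimal sketch evaluates it and confirms that the two scales approximately match at the end points. The coefficient values are an assumed reading of the printed expression (seven decimal places per coefficient, powers 0 through 8), and the function names `tau` and `dtau` are hypothetical.

```python
# Coefficients of the polynomial scale transformation, powers 0..8.
# These are an assumed reading of the values printed in the text.
COEFFS = [0.3777014, 1.4301120, -0.0528854, -0.0408096, 0.0029404,
          0.0011037, -0.0000858, -0.0000146, 0.0000010]


def tau(theta):
    """Evaluate the polynomial tau(theta) by Horner's rule."""
    result = 0.0
    for c in reversed(COEFFS):
        result = result * theta + c
    return result


def dtau(theta):
    """Evaluate the first derivative d tau / d theta by Horner's rule."""
    result = 0.0
    for k in range(len(COEFFS) - 1, 0, -1):
        result = result * theta + k * COEFFS[k]
    return result


if __name__ == "__main__":
    for theta in (-4.0, -2.0, 0.0, 2.0, 4.0):
        print(f"theta={theta:+.1f}  tau={tau(theta):+.4f}  "
              f"dtau/dtheta={dtau(theta):.4f}")
```

With the rounded coefficients, the values at $\theta = \pm 4.0$ agree with $\pm 4.0$ only to about two decimal places, which is what one would expect from seven-digit rounding of the higher-order terms.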


Figure 7-6 presents the three separate scale transformations of the Iowa Level 11 Vocabulary Subtest,
of Shiba’s Test J1 and of the Iowa Subtest with the polynomial approximation, by solid, dashed and
dotted lines, respectively. Actually, we can only see two curves, for the dotted curve practically coincides
with the solid curve.


Equivalent  Items  on  the  Dichotomous  Response  Level 


We have seen in a previous section how the amount of bias decreases as the number of items increases,
using the example of equivalent items on the dichotomous response level, which follow the constant
information model. It should be noted that the corresponding set of bias functions for equivalent items
following any mathematical model, which provides us with a strictly increasing item characteristic
function with zero and unity as its two asymptotes, can be produced from these results by an appropriate
strictly increasing scale transformation. Let $\tau = \tau(\theta)$ be such a transformation of $\theta$, and $P_g^{*}(\tau)$
denote the item characteristic function following one of such models. Setting


$P_g^{*}(\tau) = P_g(\theta)$ ,


we  obtain 
(7.21)
FIGURE 7-5

MLE Bias Function of the Iowa Level 11 Vocabulary Subtest as a Function of the Transformed
Latent Variable $\tau$ (Dashed Line) in Comparison to the Original MLE Bias Function (Solid
Line) of $\theta$. Transformation of $\theta$ to $\tau$ Is Made by a Polynomial Approximation.
FIGURE 7-6

Transformation of $\theta$ to $\tau$ Based Upon the Iowa Level 11 Vocabulary Subtest (Solid Line),
Upon Shiba’s Word/Phrase Comprehension Test J1 (Dashed Line), and Upon the Iowa Level
11 Vocabulary Subtest Using the Polynomial Approximation (Dotted Line).
where $P_g^{*}(\tau)$ and $I_g^{*}(\tau)$ indicate the item characteristic function and the item information function,
respectively, after the scale transformation, and $P_g^{*\prime}(\tau)$ and $P_g^{*\prime\prime}(\tau)$ denote the first and second
derivatives of $P_g^{*}(\tau)$ with respect to $\tau$. Since we can write for a set of $n$ equivalent items

$B^{*}(\tau) = (-2n)^{-1} P_g^{*}(\tau)\, Q_g^{*}(\tau)\, P_g^{*\prime\prime}(\tau)\, \{P_g^{*\prime}(\tau)\}^{-3}$ ,


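where $B^{*}(\tau)$ is the bias function after the scale transformation, we obtain

(7.24)  $B^{*}(\tau) = B(\theta) \left\{ \dfrac{d\theta}{d\tau} \right\}^{-1} - \{2n\, I_g(\theta)\}^{-1}\, \dfrac{d^{2}\theta}{d\tau^{2}} \left\{ \dfrac{d\theta}{d\tau} \right\}^{-3}$ .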
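The step leading to (7.24) can be sketched with the chain rule. The lines below are a reconstruction rather than a quotation: they assume that $B(\theta)$ has the same form as $B^{*}(\tau)$ with the untransformed functions, and that $I_g(\theta) = \{P_g'(\theta)\}^{2}\{P_g(\theta)Q_g(\theta)\}^{-1}$ is the item information function on the dichotomous response level.

```latex
\begin{align*}
P_g^{*\prime}(\tau) &= P_g'(\theta)\,\frac{d\theta}{d\tau}, \\
P_g^{*\prime\prime}(\tau) &= P_g''(\theta)\left(\frac{d\theta}{d\tau}\right)^{2}
                           + P_g'(\theta)\,\frac{d^{2}\theta}{d\tau^{2}}, \\
B^{*}(\tau) &= (-2n)^{-1} P_g(\theta) Q_g(\theta)
   \left\{ P_g''(\theta)\left(\frac{d\theta}{d\tau}\right)^{2}
         + P_g'(\theta)\,\frac{d^{2}\theta}{d\tau^{2}} \right\}
   \left\{ P_g'(\theta)\,\frac{d\theta}{d\tau} \right\}^{-3} \\
 &= B(\theta)\left(\frac{d\theta}{d\tau}\right)^{-1}
   - \{2n\,I_g(\theta)\}^{-1}\,\frac{d^{2}\theta}{d\tau^{2}}
     \left(\frac{d\theta}{d\tau}\right)^{-3}.
\end{align*}
```

The first term rescales the original bias by the local slope of the transformation, and the second term is the additional bias introduced by its curvature.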


For the purpose of illustration, let us consider the scale transformation which changes the constant
information model to the logistic model. Thus we have

(7.25)  $P_g^{*}(\tau) = \{1 + e^{-D a_g (\tau - b_g)}\}^{-1}$ .


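The functional relationship between $\theta$ and $\tau$ is given by

(7.26)  $\tau = (D a_g)^{-1} \log[\tan^{2}\{\alpha_g(\theta - \beta_g) + (\pi/4)\}] + b_g$ ,

(7.27)  $\theta = \alpha_g^{-1}[\tan^{-1}\{e^{(1/2) D a_g (\tau - b_g)}\} - (\pi/4)] + \beta_g$ .

The first and second derivatives of $\theta$ with respect to $\tau$ are given by

(7.28)  $\dfrac{d\theta}{d\tau} = D a_g (2\alpha_g)^{-1} \{P_g(\theta) Q_g(\theta)\}^{1/2}$

and

(7.29)  $\dfrac{d^{2}\theta}{d\tau^{2}} = D^{2} a_g^{2} (4\alpha_g)^{-1} \{P_g(\theta) Q_g(\theta)\}^{1/2} \{Q_g(\theta) - P_g(\theta)\}$ ,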

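respectively. Thus we can write from (7.19), (7.20), (7.21), (7.22), (7.28), and (7.29)

(7.30)  $P_g^{*\prime}(\tau) = D a_g P_g(\theta) Q_g(\theta) = D a_g P_g^{*}(\tau) Q_g^{*}(\tau)$ ,

(7.31)  $P_g^{*\prime\prime}(\tau) = D^{2} a_g^{2} P_g(\theta) Q_g(\theta) \{Q_g(\theta) - P_g(\theta)\}$

$\qquad\quad = D^{2} a_g^{2} P_g^{*}(\tau) Q_g^{*}(\tau) \{Q_g^{*}(\tau) - P_g^{*}(\tau)\}$ ,

(7.32)  $I_g^{*}(\tau) = D^{2} a_g^{2} P_g(\theta) Q_g(\theta) = D^{2} a_g^{2} P_g^{*}(\tau) Q_g^{*}(\tau)$ ,

where

(7.33)  $Q_g^{*}(\tau) = 1 - P_g^{*}(\tau)$ .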

We can easily see that these results agree with those obtained directly from (7.25). For the bias
function $B^{*}(\tau)$, we have from (7.24), (7.28), and (7.29)

(7.34)  $B^{*}(\tau) = (2nDa_g)^{-1} \{P_g(\theta) - Q_g(\theta)\} \{P_g(\theta) Q_g(\theta)\}^{-1}$

$\qquad\quad = (nDa_g)^{-1} \{P_g^{*}(\tau) Q_g^{*}(\tau)\}^{-1} \{P_g^{*}(\tau) - (1/2)\}$ .

We can see that (7.34) is a special case of (1.4) when all the $n$ items are equivalent, by replacing $\theta$ by
$\tau$ and $P_g(\theta)$ by $P_g^{*}(\tau)$.
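The agreement between the transformed bias and the closed form for the logistic model can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it takes the constant information model to be $P_g(\theta) = \sin^{2}\{\alpha_g(\theta - \beta_g) + \pi/4\}$ (the form implied by the tangent relationship between the two scales), uses the chain-rule form of the transformed bias with item information $4\alpha_g^{2}$, and all function and parameter names are hypothetical.

```python
import math

D = 1.7  # scaling factor of the logistic model


def constant_info_P(theta, alpha, beta):
    """Assumed constant information model: sin^2[alpha*(theta-beta) + pi/4]."""
    return math.sin(alpha * (theta - beta) + math.pi / 4) ** 2


def logistic_P(tau, a, b):
    """Logistic item characteristic function on the transformed scale."""
    return 1.0 / (1.0 + math.exp(-D * a * (tau - b)))


def theta_to_tau(theta, alpha, beta, a, b):
    """Scale transformation from the constant information to the logistic scale."""
    u = alpha * (theta - beta) + math.pi / 4
    return math.log(math.tan(u) ** 2) / (D * a) + b


def bias_constant_info(theta, alpha, beta, n):
    """B(theta) = (-2n)^-1 P Q P'' (P')^-3 for n equivalent items."""
    u = alpha * (theta - beta) + math.pi / 4
    P = math.sin(u) ** 2
    Q = 1.0 - P
    dP = alpha * math.sin(2 * u)             # first derivative of P
    d2P = 2 * alpha ** 2 * math.cos(2 * u)   # second derivative of P
    return -P * Q * d2P / (2 * n * dP ** 3)


def transformed_bias(theta, alpha, beta, a, n):
    """B*(tau) from B(theta) and the derivatives of theta with respect to tau."""
    P = constant_info_P(theta, alpha, beta)
    Q = 1.0 - P
    d1 = D * a * math.sqrt(P * Q) / (2 * alpha)                      # d theta / d tau
    d2 = (D * a) ** 2 * math.sqrt(P * Q) * (Q - P) / (4 * alpha)     # second derivative
    info = 4 * alpha ** 2   # item information of the constant information model
    return (bias_constant_info(theta, alpha, beta, n) / d1
            - d2 / (2 * n * info * d1 ** 3))


def logistic_bias(tau, a, b, n):
    """Closed form (n D a)^-1 (P* Q*)^-1 (P* - 1/2) for n equivalent items."""
    P = logistic_P(tau, a, b)
    return (P - 0.5) / (n * D * a * P * (1.0 - P))


if __name__ == "__main__":
    alpha, beta, a, b, n = 0.25, 0.0, 1.0, 0.3, 20
    for theta in (-2.0, -1.0, 0.5, 2.0):
        t = theta_to_tau(theta, alpha, beta, a, b)
        print(f"theta={theta:+.1f} tau={t:+.4f} "
              f"chain rule={transformed_bias(theta, alpha, beta, a, n):+.6f} "
              f"closed form={logistic_bias(t, a, b, n):+.6f}")
```

The two columns should coincide up to floating-point error for any $\theta$ inside the range where the constant information model is strictly increasing.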

8  Adaptive  Testing 

Observations made in previous sections suggest how the bias behaves in adaptive testing.
First of all, in order to reach practical unbiasedness in estimating the individual subject’s ability in
adaptive testing, we need to make sure that a sufficient amount of test information has been reached for
each individual subject before terminating the presentation of new items. We can control this easily if we
use the amount of test information as the criterion for terminating the presentation of new items, i.e., as the
stopping rule. If the items follow the normal ogive or logistic model in the adaptive testing situation, for
subjects of intermediate ability levels it is likely that in the initial stage the item difficulty parameters
fluctuate both negatively and positively around the subject’s true ability level, and consequently the
biases in the negative and positive directions cancel out, since an item pool usually has plenty of
items of intermediate difficulty. In such a case, we do not have to worry too much about the influence
of the initial items on the eventual bias of the ability estimate. Once the maximum likelihood estimate
has more or less stabilized, chances are slim that an additional item causes a substantial
bias, provided that the program is written in such a way that an item providing a large amount of information
at the current estimated ability level is presented next, and that the item pool has a sufficient
number of items whose difficulty levels are around the subject’s true ability level. There is a greater
possibility that the examinee obtains a biased ability estimate if his ability level is close to either end
of the configuration of difficulty parameters, since biases caused by the initially presented items are not
likely to cancel themselves out and, moreover, there may not be a sufficient number of items whose
difficulty levels are close to his true ability level.
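The procedure described above (present the most informative remaining item at the current estimate, and stop once the test information reaches a preset amount) can be sketched as follows. This is a minimal illustration under assumptions not found in the text: items follow the two-parameter logistic model, the maximum likelihood estimate is found by bisection and clamped to the interval [-4, 4], and the pool, the target value and all names are hypothetical.

```python
import math
import random

D = 1.7  # scaling factor of the logistic model


def prob(theta, a, b):
    """Two-parameter logistic item characteristic function."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def item_info(theta, a, b):
    """Item information of a logistic item at theta."""
    p = prob(theta, a, b)
    return (D * a) ** 2 * p * (1.0 - p)


def mle(items, responses, lo=-4.0, hi=4.0, steps=50):
    """Solve the likelihood equation by bisection, clamped to [lo, hi]."""
    def score(theta):
        return sum(D * a * (u - prob(theta, a, b))
                   for (a, b), u in zip(items, responses))
    if score(lo) <= 0.0:
        return lo
    if score(hi) >= 0.0:
        return hi
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def adaptive_test(pool, true_theta, target_info, rng, start=0.0):
    """Run one simulated adaptive test with an information-based stopping rule."""
    remaining = list(pool)
    used, responses = [], []
    estimate, total = start, 0.0
    while remaining:
        # Present the remaining item with the most information at the estimate.
        item = max(remaining, key=lambda ab: item_info(estimate, *ab))
        remaining.remove(item)
        u = 1 if rng.random() < prob(true_theta, *item) else 0
        used.append(item)
        responses.append(u)
        estimate = mle(used, responses)
        total = sum(item_info(estimate, a, b) for a, b in used)
        if total >= target_info:  # stopping rule
            break
    return estimate, len(used), total


if __name__ == "__main__":
    pool = [(1.0, -3.0 + 0.1 * k) for k in range(61)]
    est, n_used, info = adaptive_test(pool, 0.5, 10.0, random.Random(0))
    print(f"estimate={est:.3f}  items used={n_used}  test information={info:.3f}")
```

Because each item in this hypothetical pool contributes at most $(1.7)^{2}/4 \approx 0.72$ units of information, a target of 10 cannot be met with fewer than 14 items, and more are needed when the true ability lies near the ends of the difficulty range.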

If  the  item  pool  consists  of  items  following  the  three-parameter  normal  ogive  or  logistic  model,  the 
effect  of  random  guessing  on  the  amount  of  bias  can  be  substantial,  especially  on  the  lower  levels  of 
ability.  In  such  a  case,  it  is  imperative  to  include  many  easy  items  in  the  item  pool. 
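The size of the guessing effect can be illustrated with the equivalent-item form of the bias function, $(-2n)^{-1} P_g Q_g P_g'' \{P_g'\}^{-3}$. The sketch below assumes that this form applies with the three-parameter logistic item characteristic function of (1.3); the parameter values and the function name are hypothetical and chosen only for illustration.

```python
import math

D = 1.7  # scaling factor of the logistic model


def bias_equivalent_items(theta, a, b, c, n):
    """Bias (-2n)^-1 P Q P'' (P')^-3 for n equivalent three-parameter logistic items."""
    psi = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    P = c + (1.0 - c) * psi
    Q = 1.0 - P
    dP = (1.0 - c) * D * a * psi * (1.0 - psi)                                # P'
    d2P = (1.0 - c) * (D * a) ** 2 * psi * (1.0 - psi) * (1.0 - 2.0 * psi)    # P''
    return -P * Q * d2P / (2.0 * n * dP ** 3)


if __name__ == "__main__":
    # Ability two units below the common difficulty, 40 equivalent items.
    for c in (0.0, 0.2):
        print(f"c={c:.1f}  bias={bias_equivalent_items(-2.0, 1.0, 0.0, c, 40):+.4f}")
```

At this low ability level the bias with a guessing parameter of 0.2 is several times larger in magnitude than without guessing, which is in line with the recommendation to include many easy items in the pool.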

In  any  case,  the  bias  function  can  be  a  good  indicator  in  evaluating  the  item  pool,  if  we  use  it 
wisely  and  effectively.  Those  results  that  were  described  in  previous  sections  will  give  us  information 
and  suggestions  as  to  how  to  improve  an  existing  item  pool. 


9  Discussion  and  Conclusions 

The bias function of the maximum likelihood estimate has been proposed for the general discrete
response level, which includes Lord’s bias function in the three-parameter logistic model as a special
case. The function has been observed on both the dichotomous and graded response levels, with
respect to various mathematical models. Effects of the item discrimination parameters, of the item
difficulty parameters, and of the number of items have also been observed. Local changes in the amount
of bias caused by the scale transformation have been examined from several angles, and the bias
function has also been discussed in the context of adaptive testing.


Since local unbiasedness is important, the proposed function will be directly useful in
the estimation of the subject’s latent trait. An even greater usefulness of the function can be seen in
the context of more elaborate methodologies, as is exemplified in the nonparametric approach to the
estimation of the operating characteristics of discrete responses.